ABSTRACT

This  report  reviews  theoretical  and  empirical  studies  of 
decision  making.  The  purpose  of  the  review  was  to  identify  results 
that  would  be  applicable  to  the  problem  of  training  decision  makers. 

The  literature  on  decision  making  is  extensive.  However, 
relatively  few  studies  have  dealt  explicitly  with  the  problem  of 
training  in  decision-making  skills.  The  task,  therefore,  was  to 
gather from the general literature on decision making any implications
that could be found for training.

Decision  making  is  conceptualized  here  as  a type  of  problem 
solving,  and  the  review  is  organized  in  terms  of  the  following 
component tasks: information gathering, data evaluation, problem
structuring, hypothesis generation, hypothesis evaluation, preference
specification, action selection, and decision evaluation.
Implications  of  research  findings  for  training  are  discussed  in 
the  context  of  descriptions  of  each  of  these  tasks. 

A general  conclusion  drawn  from  the  study  is  that  decision 
making  is  probably  not  sufficiently  well  understood  to  permit  the 
design  of  an  effective  general-purpose  training  system  for  decision 
makers. Systems and programs could be developed, however, to
facilitate training with respect to specific decision-making skills.
The development of more generally applicable training techniques
or systems should proceed in an evolutionary fashion.

Training  is  one  way  to  improve  decision-making  performance; 
another  is  to  provide  the  decision  maker  with  aids  for  various 
aspects  of  his  task.  Because  training  and  the  provision  of  decision 
aids  are  viewed  as  complementary  approaches  to  the  same  problem, 
the report ends with a discussion of several decision-aiding techniques
that are in one or another stage of study or development.

FOREWORD

The Human Factors Laboratory of the Naval Training Equipment
Center has been involved in decision-making research with the
objective of developing an approach to decision-making training
which will improve the decision-making and tactical performance
capabilities of Navy commanders. This report is the result of an
analytical review of decision-making research which was performed
to identify information pertinent to the training of decision-making
skills.

The outcome of this effort corroborated an impression that
very little of the great amount of decision-making research has
directly addressed the problem of training in decision making.
The review has identified implications for the training of decision
makers and areas for research which could provide insight for the
development of effective training procedures and programs.


WILLIAM  P.  LANE 
Acquisition  Director 
SECTION I

INTRODUCTION

Much has been written about the importance of decision making
for industry, for government, for the military and for rational--or
at least reasonable--people in general. Moreover, a great deal of
research has been conducted on decision-making behavior. In spite
of these facts--or perhaps because of them--there is no general
agreement concerning what decision making is, how it should be done,
how it is done, how to tell whether it is done well or poorly, and
how to train people to do it better.

The term "decision making" has been applied to a very broad
range of behaviors. The detection of weak sensory stimuli has been
viewed, in part, as a decision process (Green & Swets, 1966), as
has perception by humans more generally (Bruner, 1957). Pattern
classification by machines (Sebestyen, 1962), the retrieval of
information from memory (Egan, 1958), the performance of skilled
tasks such as automobile driving (Algea, 1964) and airplane piloting
(Szafran, 1970), the production of speech (Rochester & Gill,
1973), educational counseling (Stewart & Winborn, 1973), the purchasing
of industrial products (Reingen, 1973), the evaluation of
the performance of salesmen (Sheridan & Carlson, 1972), and the
conducting of a laboratory experiment (Edwards, 1956) are also
representative of the types of processes that have been discussed
under the rubric of decision making. Probably when the term is used
in industrial, governmental and military contexts, however, what
the user has in mind is something close to what Schrenk (1969)
describes as "situations characterized by fairly well-defined
objectives, significant action alternatives, relatively high
stakes, inconclusive information and limited time for decision"
(p. 544). We hasten to add that to limit one's attention to
situations that have all of these characteristics would preclude
consideration of the large majority of experimental investigations
of decision making; in particular, in very few laboratory studies
of decision making have the stakes been high, and one may question
in many--if not most--cases the significance of the action alternatives
to the experimental subjects. It does not necessarily
follow that the results of laboratory studies have no relevance to
real-life decision making, of course. The degree to which one is
willing to extrapolate from the one situation to the other depends
on the extent to which one subscribes to the view that simple and
inconsequential decision problems are solved--at least in principle--in
the same ways as are those that are complex and consequential.

As Schrenk (1969) has pointed out, there are three ways to
improve the performance of the human decision element in a system:
(1) selection (insure that decisions are made only by individuals
who are competent to make them), (2) training (attempt to improve
the decision-related skills of people in decision-making positions),
and (3) decision aiding (provide decision makers with procedural
and technical aids to compensate for their own limitations). To
the extent that performance of a decision-making system is of interest,
as opposed to that of a human being, another possibility
that deserves consideration is that of automation (have machines
perform those decision tasks that they can perform better than
people).

The  number  of  tasks  that  are  now  performed  by  machines  that 
were  once  thought  to  require  human  skills  is  growing  and  will 
continue  to  do  so.  Many  tasks  that  involve  decision  making  by 
some  definition  should  be — indeed,  many  have  been — automated. 
There  is  little  justification  for  wasting  a good  human  brain  to 
make  what  Soelberg  (1967)  calls  "programmed  decisions,"  decisions 
that  are  made  with  sufficient  frequency  and  under  sufficiently 
specifiable conditions to permit the detailed description of procedures
for making them. Thermostats, governors, regulators,
stabilizers, computer algorithms, and such things, are the preferred
"decision makers" for these types of situations. The
situations with which we are primarily concerned are not of this
straightforward programmed type. They are situations that are
novel, unstructured or unplanned for, or they involve human preferences
that are not easily specified, or potential action consequences
that are not known with certainty. Clearly, these
types  of  situations  are  the  more  interesting  objects  of  study, 
and  are  probably  more  representative  of  what  people  view  as  bona 
fide  decision  making. 

It  is  important  to  recognize  that  the  objectives  of  much 
decision-making  research  are  to  make  novel  situations  less  novel 
by  providing  prototypes  in  terms  of  which  the  novel  situations 
can  be  perceived,  to  facilitate  the  imposition  of  structure  on 
situations  when  apparent  structure  is  lacking,  and  to  provide 
techniques  for  decreasing  the  probability  of  surprises  and  for 
coping  with  unplanned-for  situations  as  though  they  had  been 
anticipated  all  along.  But  the  reader  who  might  think  that  such 
objectives  could,  if  realized,  take  the  charm  out  of  decision 
making  may  rest  easy.  There  seems  little  danger  of  success  to 
the  point  of  reducing  all  decision  making  to  an  algorithmic 
process  in  the  near  future.  Indeed,  there  are  some  aspects  of 
decision  making  that  men  may  never  feel  comfortable  turning  over 
to machines. Hence, the needs for selection, training and decision
aiding are still real, and are likely to continue to be
for some time to come. Moreover, as more and more of the procedurizable
tasks that were once performed by men do become automated,
the tasks that are left to be performed by men--or perhaps
by men and machines in collaboration--take on added interest and
significance  by  virtue  of  their  very  resistance  to  automation. 
Should  not  those  tasks  which  seem  to  require  the  attention  of 
human  brains  be  the  tasks  that  hold  a unique  fascination  for  us 
as  human  beings? 

The general question that motivates this study is the question
of whether individuals can be trained to be effective decision
makers in unprogrammed situations. And if the answer to that
question appears to be yes, the next question that presents itself
is that of how that training can be accomplished most effectively.
Immediately, one is led to more specific questions. Does it make
sense  to  think  of  decision  making  as  a skill,  or  as  a collection 
of  skills,  that  can  be  developed  in  a sufficiently  general  way  that 
they  can  be  applied  in  a variety  of  specific  contexts?  What  is  it 
that  the  decision  maker  needs  to  be  taught?  Concepts?  Facts? 
Principles?  Attitudes?  Procedures?  Heuristics? 

The literature on decision-making research is voluminous, but
despite numerous references to the importance of the training of
decision makers (e.g., Edwards, 1962; Evans & Cody, 1969; Fleming,
1970; Hammell & Mara, 1970; Kanarick, 1969; Kepner & Tregoe, 1965;
Scalzi, 1970; Sidorsky & Simoneau, 1970), the number of studies
that have explicitly addressed the question of exactly what should
be taught and how the teaching can best be accomplished is remarkably
small. The central interest in the area continues to be
with  parameterization  of  the  decision  maker  and  his  environment 
and  with  generation  of  specific  aids  to  the  decision  process. 

This  review  is  not  limited,  therefore,  to  studies  that  have 
focused specifically on the issue of decision training. We have
attempted instead to look at a rather broad cross section of the
general decision-making research literature with a view to finding,
wherever we could, implications for the training of decision makers
and  clues  concerning  what  further  research  might  lead  to  more 
effective  training  procedures  or  programs. 

SECTION II

SOME  COMMENTS  ON  DECISION  THEORY 

One  can  distinguish  two  rather  different  approaches  that 
have been taken to the study of decision making. One is analytical;
the other is basically empirical. A common goal of both
approaches,  however,  is  the  development  of  formal  models  of 
decision  processes.  In  the  first  case,  one  tries  to  analyze 
decision  situations — often  hypothetical  situations — abstracting 
from  them  their  common  elements.  One  then  attempts  to  produce 
a model  of  the  decision-making  process,  using  the  constructs  that 
have  been  identified  in  the  process  of  analysis.  In  the  empirical 
approach,  one  begins  by  observing  individuals  making  decisions  in 
real-life situations, and attempts, on the basis of these observations,
to develop parsimonious descriptions of decision-making
behavior.

Each  approach  has  its  strengths  and  its  weaknesses.  The 
models  generated  by  analysis  are  likely  to  be  more  abstract  than 
those  developed  through  observation.  As  a consequence,  they  are 
typically more general. However, there may be considerable difficulty
in applying such models in specific cases. This is true
because real-life decision situations frequently are not easily
describable in terms that an application of a model would require.
In contrast, a model of a decision-making process that is developed
by  observing  decision  makers  in  action  is  likely  to  be  applicable, 
at  least  to  situations  highly  similar  to  that  from  which  the  model 
is  derived.  Such  models  may  lack  generality,  however,  and  prove 
to  be  inapplicable  outside  the  context  in  which  they  are  developed. 

2.1 Prescriptive versus Descriptive Models

A prescriptive  model  indicates  what  one  should  do  in  a given 
decision  situation;  a descriptive  model  is  intended  to  describe 
what one actually does. Typically, prescriptive models are the
outcomes of analytical approaches to the study of decision making,
whereas empirical approaches generally lead to descriptive models.
In theory at least, a prescriptive model may be used either as a
guide for decision makers or as a standard against which to assess
the extent to which decision-making performance approaches optimality.
Descriptive models differ from prescriptive models insofar
as human decision makers perform in a less than optimal fashion.
Were a decision maker to behave in an optimal fashion, a
description of his behavior would constitute a prescriptive model.
Comparisons between prescriptive and descriptive models can be
instructive in suggesting the reasons why human behavior is sometimes
not optimal.

Prescriptive models are generally associated with economists
and mathematical statisticians. Among the developers and expositors
of prescriptive decision theory are Bernoulli (1738), Neyman and
Pearson (1933), Samuelson (1947), von Neumann and Morgenstern
(1947), Wald (1947, 1950), Good (1952), Blackwell and Girshick
(1954), Savage (1954), Luce and Raiffa (1957), and Schlaifer (1959).
Such  models  typically  postulate  an  "economic,"  or  at  least  a 
"rational,"  man  who  behaves  in  a way  that  is  entirely  consistent 
with  his  decision  objectives  and  who  does  not  have  some  of  the 
limitations  of  real  people. 

Descriptive  models  were  introduced  primarily  by  psychologists 
and  other  students  of  human  behavior,  notably  Edwards  (1954,  1961); 
Peterson,  Birdsall,  and  Fox  (1954);  Thrall,  Coombs,  and  Davis 
(1954);  Simon  (1954,  1955);  Tanner  (1956);  Davidson,  Suppes,  and 
Siegel  (1957);  Festinger  (1957);  Luce  (1959);  Siegel  (1959); 
Rapoport (1960); Estes (1961); and Edwards, Lindman, and Savage
(1963). The objective in this case has been to discover by
experiment and observation how human beings, given their limitations,
perform in decision-making situations. It is important
to  note  that  descriptive  models  have  been  viewed  as  descriptive 
only  of  the  behavior  of  the  decision  maker,  and  not  necessarily 
of  the  thinking  that  leads  to  that  behavior.  For  example,  the 
finding  that  an  individual's  choice  between  two  gambles  can  be 
predicted  on  the  basis  of  which  has  the  most  favorable  "expected 
outcome"  is  not  taken  as  evidence  that  in  making  the  choice  the 
individual  actually  goes  through  the  process  of  calculating 
expected  values  and  picking  the  alternative  with  the  largest  one 
(Edwards, 1955; Ellsberg, 1961).

The two lines of development--prescriptive and descriptive
models--have not proceeded independently of each other. Several
of the investigators mentioned above have made significant contributions
of both prescriptive and descriptive types. Moreover,
one  approach  that  has  been  taken  to  the  study  of  human  limitations 
is  that  of  attempting  to  modify  prescriptive  models  so  that  they 
are  in  fact  more  descriptive.  Typically,  what  this  involves  is 
the  imposition  of  constraints  on  the  model  that  represent  specific 
limitations  of  the  human.  For  example,  a prescriptive  model  that 
assumes  an  infallible  memory  of  unlimited  capacity  is  unlikely 
to  be  very  descriptive  of  human  behavior;  to  modify  such  a model 
for the purpose of increasing its descriptiveness would necessitate
at least the addition of some constraints that represent
such  factors  as  a limitation  on  memory  capacity  and  degradation 
of  stored  information  over  time. 

The  distinction  between  prescriptive  and  descriptive  models 
is  sometimes  blurred  in  the  literature  and  one  cannot  always  be 
sure  in  which  way  a proponent  of  a model  intends  for  it  to  be 
taken.  On  the  other  hand,  many  writers  have  observed  that  the 
models  deriving  from  theories  of  economics  do,  in  fact,  fail  to 
describe behavior, or at least to do so very accurately. Miller
and Starr (1967) point out, for example, that in the economist's
view  of  decision  making,  the  objective  of  the  decision  maker  is  to 
maximize  the  "utility"  that  he  can  achieve  within  the  limitations 
of  his  resources.  They  note,  however,  that  the  assumption  that 
individuals  do  act  so  as  to  maximize  utility  has  been  challenged 
by  many  investigators  of  decision  making.  If  rationality  is 
defined  in  terms  of  the  extent  to  which  behavior  is  appropriate 
to  the  maximization  of  utility,  they  note,  then  when  people  do 
not  maximize  utility,  they,  by  definition,  are  acting  irrationally. 
Miller  and  Starr  list  several  factors  that  have  been  suggested  as 
possible  reasons  for  the  failure  of  decision  makers  to  behave  in 
an  optimal  way:  "the  inability  of  the  individual  to  duplicate  the 
rather  recondite  mathematics  which  economists  have  used  to  solve 
the  problem  of  maximization  of  utility;  the  existence  of  other 
values  which,  though  not  readily  quantifiable,  do  cause  divergences 
from  the  maximization  of  utility  in  the  marketplace;  the  effect  of 
habit;  the  influence  of  social  emulation;  the  effect  of  social 
institutions"  (p.  25). 

While  interest  in  prescriptive  models  stems  at  least  in 
part  from  the  assumption  that  they  can  provide  guidance  for 
decision  makers  in  real-life  situations,  their  application  often 
proves  to  be  less  than  straightforward.  Haythorn  (1961)  notes 
the  difficulty  that  operations  analysts  and  operations  researchers 
often encounter in trying to analyze decision situations in complex
organizations to the point that prescriptive models can be
applied.  He  ascribes  the  difficulty  to  several  factors:  "First 
is  the  fact  that  organizations  are  constructed  by  men  with  some 
purposes  in  mind,  although  these  are  not  usually  stated  very 
explicitly.  Analytic  solutions  must  assume  that  the  decision 
maker  is  rational,  that  the  parameters  relevant  to  the  decision 
are  quantifiable,  and  that  the  information  necessary  to  make  an 
optimum  decision  is  available.  A careful  look  at  the  view  of 
the  world  held  by  critical  decision  makers  reveals  that  they  are 
by  no  means  completely  rational;  that  some  of  their  objectives 
are  not  easily  quantifiable,  and  perhaps  even  incompatible  with 
other  objectives;  that  they  do  not  have  all  of  the  information 
needed  in  many  cases;  and  that  frequently  the  information  they 
have  is  inaccurate"  (p.  23). 

Schrenk  (1969)  has  argued  that  progress  on  the  development 
of  techniques  for  aiding  decision  makers  will  be  impeded  until 
a model  of  "optimum"  decision  processes  that  makes  realistic 
assumptions  about  human  capabilities  is  forthcoming.  Such  a 
model,  Schrenk  suggests,  should  reflect  the  behavior  of  "reasoning 
man,"  a concept  that  he  distinguishes  from  the  rational  man  of 
economic  decision  theory.  "The  idea  is  not  to  specify  an  ’ideal’ 
decision  procedure  which  will  produce  perfect  choices  in  abstract 
or  laboratory  situations,  but  rather  to  develop  a process  that 
will  yield  better  decisions  in  real  situations"  (p.  548).  Schrenk 
sees four purposes that such a model might serve: (1) it could
provide a framework for the classification and integration of the
results of decision-making research; (2) it could provide guidance
for further research; (3) it could help system designers to
structure  decision  tasks  and  to  allocate  decision  functions  to 
men  and  machines;  and  (4)  it  could  help  guide  the  development  of 
decision-aiding  concepts. 

2.2 Worth, Probability and Expectation

Sometimes  a decision  maker  has  the  task  of  choosing  one  from 
among  several  alternative  courses  of  action,  knowing  what  the 
effect  of  any  choice  would  be.  (This  situation,  which  is  referred 
to  as  decision  making  under  certainty,  is  discussed  in  Section  IX.) 
Often,  however,  one  must  make  a choice  when  the  consequences  of 
that choice cannot be anticipated with certainty. In the latter
situation, the decision maker is said to be making a decision
"under risk." The most common way of dealing with risky decisions
quantitatively  has  been  with  models  that  make  use  of  the  concept 
of  mathematical  expectation. 

The  "expectation"  associated  with  a choice  is  calculated  by 
obtaining  the  product  of  some  measure  of  worth  of  each  outcome 
and  a measure  of  the  probability  of  that  outcome,  and  summing  over 
all  outcomes  that  could  result  from  the  choice  of  interest.  It 
has  sometimes  been  assumed  that  the  decision  maker  attempts  to 
make  a choice  that  maximizes  his  "expected"  gain.  More  precisely, 
it is assumed that the decision maker behaves as though he calculated,
for each action alternative, the sum of the products of the
worths and probabilities of the possible outcomes associated with
that alternative, and picked the alternative for which this sum
was greatest. The "as though" in the preceding statement is
important. No one contends that decision makers, as a rule, really
perform the arithmetic necessary to compute expectation; it is
only suggested that choices are made as though they were based on
such  calculations. 
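
The "as though" computation can be sketched in a few lines. This is a
minimal illustration only: the alternatives, worths and probabilities
are invented for the example, and the function names are hypothetical
rather than drawn from any model cited here.

```python
# Hypothetical illustration: each action alternative is a list of
# (worth, probability) pairs, one pair per possible outcome.

def expectation(outcomes):
    """Sum of worth * probability over all outcomes of one alternative."""
    return sum(worth * prob for worth, prob in outcomes)


def best_alternative(alternatives):
    """Return the name of the alternative with the largest expectation."""
    return max(alternatives, key=lambda name: expectation(alternatives[name]))


# Invented numbers, chosen only to show the arithmetic.
alternatives = {
    "A": [(10.0, 0.5), (0.0, 0.5)],      # expectation 5.0
    "B": [(4.0, 1.0)],                   # expectation 4.0 (a sure thing)
    "C": [(100.0, 0.01), (0.0, 0.99)],   # expectation 1.0
}

choice = best_alternative(alternatives)
```

With these invented numbers, alternative A has the largest expectation
and is the one selected. Nothing in the sketch implies that a decision
maker actually carries out this arithmetic.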

Each of the factors in the expectation equation--worth and
probability--can be treated as either an objective or a subjective
variable. The four possible combinations of objective and subjective
indicants of worth combined with objective and subjective
measures of probability define four classes of expectation models
that have been studied. Table 1 gives expressions, in the notation
used by Coombs, Bezembinder, and Goode (1967), for expectations
representing each of these models. Much of the research on
decision making under risk has been concerned with determining
which of these models is most descriptive of human behavior, and
with developing techniques for measuring subjective worths and
probabilities.

TABLE 1. FOUR BASIC TYPES OF EXPECTATION MODELS

                                Type of      Type of       Expectation associated   Expectation associated
                                worth        probability   with each possible       with choice (which has
Model                           measure      measure       outcome                  n possible outcomes)

Expected value                  objective    objective     p_j v_j                  \sum_{j=1}^{n} p_j v_j
Expected utility                subjective   objective     p_j u_j                  \sum_{j=1}^{n} p_j u_j
Subjectively expected value     objective    subjective    \psi_j v_j               \sum_{j=1}^{n} \psi_j v_j
Subjectively expected utility   subjective   subjective    \psi_j u_j               \sum_{j=1}^{n} \psi_j u_j

p_j    = an objective probability
\psi_j = a subjective probability
v_j    = an objective measure of value (e.g., amount of money)
u_j    = a subjective measure of worth (or utility)

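
The four expectations of Table 1 differ only in which worth measure and
which probability measure enter the sum. The sketch below makes that
explicit. It assumes that all four quantities are already available for
each outcome; the numbers are invented, and the names `Outcome` and
`expectations` are hypothetical.

```python
from typing import NamedTuple


class Outcome(NamedTuple):
    v: float    # objective value (e.g., amount of money)
    u: float    # subjective worth (utility) of that outcome
    p: float    # objective probability
    psi: float  # subjective probability


def expectations(outcomes):
    """Return the four Table 1 expectations for one choice."""
    return {
        "EV": sum(o.p * o.v for o in outcomes),     # expected value
        "EU": sum(o.p * o.u for o in outcomes),     # expected utility
        "SEV": sum(o.psi * o.v for o in outcomes),  # subjectively expected value
        "SEU": sum(o.psi * o.u for o in outcomes),  # subjectively expected utility
    }


# Invented two-outcome gamble: the decision maker is assumed to overestimate
# the chance of winning (psi = 0.5 against p = 0.25).
gamble = [
    Outcome(v=100.0, u=10.0, p=0.25, psi=0.5),
    Outcome(v=0.0, u=0.0, p=0.75, psi=0.5),
]

result = expectations(gamble)
```

For this invented gamble the four expectations are 25.0, 2.5, 50.0 and
5.0, which shows how the same choice can look quite different depending
on which model is taken to apply.
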
The first of the models listed in Table 1, the Expected Value
model, is the least complex conceptually, and the most easily applied,
inasmuch as both of its parameters are objectively defined.
Although this model has some appeal as a prescriptive model, it
has proved not to be generally descriptive of how real decision
makers behave (see, for example, Coombs, Dawes, & Tversky, 1970;
Edwards, 1961; Lichtenstein & Slovic, 1971; Lichtenstein, Slovic,
& Zink, 1969).

The  inadequacy  of  the  Expected  Value  model  as  a descriptive 
model  is  clearly  illustrated  by  the  well-known  St.  Petersburg 
paradox.  Suppose  one  were  offered  an  opportunity  to  purchase  the 
following  gamble.  A fair  coin  is  to  be  tossed  until  it  comes  up 
tails,  at  which  time  the  coin  tossing  is  terminated  and  the  winnings 
are  collected.  If  the  coin  comes  up  heads  on  the  first  toss,  the 
purchaser  will  receive  $2.00;  if  it  comes  up  heads  on  both  the 
first  and  second  toss,  he  will  receive  $6.00  (or  $2.00  for  the 
first  toss  and  $4.00  for  the  second).  More  generally,  if  it  comes 
up heads for n consecutive tosses, he will receive $2.00 for the
first toss, $4.00 for the second, $8.00 for the third, ... and 2^k
dollars for the kth, for a total of

\[ \sum_{k=1}^{n} 2^{k} \text{ dollars.} \]

Since, by definition, the successive tosses are independent, the
expected value of this gamble in dollars is given by

\[ EV = \tfrac{1}{2} \cdot 2 + \tfrac{1}{4} \cdot 4 + \cdots + \tfrac{1}{2^{n}} \cdot 2^{n} + \cdots = 1 + 1 + 1 + \cdots \]

which  is  to  say,  it  is  infinite.  If  one  were  attempting  to  maximize 
expected  value,  therefore,  one  should  be  willing  to  pay  a large 
amount  of  money  indeed  to  play  this  game.  It  would  be  surprising, 
however, if many people could be found who would be willing to
risk their life savings, say, which would be small by comparison
with the expected gain, to purchase this gamble. In general, it
is clear that the attractiveness of a gamble depends not only on
the expected value of the outcome but on such factors as the amount
that one could possibly lose, and the nature of the distribution
of probabilities over the possible outcomes. In the gamble described
above, for example, the probability is .5 that the purchaser
will win nothing, and .75 that he will win at most $2.00.
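
The arithmetic of the gamble can be checked with a short sketch. It
truncates the game at a finite number of tosses, which is an assumption
made only so that the sums can be computed, and the function names are
hypothetical. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def truncated_expected_value(max_tosses):
    """Expected value if the game is stopped after at most max_tosses tosses.

    The 2**k dollars for the kth toss are paid only if the first k tosses
    are all heads, which has probability 1 / 2**k.
    """
    return sum(Fraction(1, 2**k) * 2**k for k in range(1, max_tosses + 1))


def total_winnings(n_heads):
    """Total paid after n_heads consecutive heads: 2 + 4 + ... + 2**n_heads."""
    return sum(2**k for k in range(1, n_heads + 1))


def prob_win_at_most(amount, max_heads=60):
    """Probability that total winnings do not exceed `amount`.

    Exactly n heads followed by a tail has probability 1 / 2**(n + 1).
    Runs longer than max_heads are ignored (a truncation assumption).
    """
    return sum(
        Fraction(1, 2 ** (n + 1))
        for n in range(max_heads + 1)
        if total_winnings(n) <= amount
    )
```

Each additional permitted toss adds exactly one dollar to the truncated
expected value, so `truncated_expected_value(n)` equals n and grows
without bound. `prob_win_at_most(0)` and `prob_win_at_most(2)` give 1/2
and 3/4, the two probabilities quoted above.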

In  spite  of  the  inadequacy  of  the  Expected  Value  model  as  a 
generally  valid  description  of  behavior,  it  should  be  noted  that 
the  model  does  a creditably  good  job  of  describing  behavior,  at 
least  grossly,  in  many  decision  situations.  Even  in  the  case  of 
gambling behavior, it does not invariably fail; "about 88% of the
job" of explaining the behavior of the Las Vegas gamblers studied
by  Edwards,  for  example,  could  be  done  on  the  basis  of  a knowledge 
of  the  expected  value  of  each  bet  (Rapoport  & Wallsten,  1972). 

Implicit in the Expected Value model is the assumption that
the  monetary  value  of  a decision  outcome  represents  its  real  worth 
to  the  decision  maker,  and  that  this  worth  is  the  same  for  all 
individuals.  Recognition  that  such  an  assumption  is  undoubtedly 
false  led  to  the  formulation  of  the  Expected  Utility  model  in 
which  monetary  value  is  replaced  by  a measure  of  the  "utility"  of 
an  outcome  for  the  particular  decision  maker  involved.  According 
to this formulation the same decision outcome may appeal to different
individuals to different degrees, and, consequently, preferences
among decision alternatives with uncertain outcomes may
differ from one decision maker to another. The Expected Utility
model was first proposed by Bernoulli (1738) and given its modern
axiomatic form by von Neumann and Morgenstern (1947).
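
To suggest why replacing monetary value with utility matters, the coin-
tossing gamble described earlier can be re-evaluated under a concave
utility function. The sketch below is illustrative only. The logarithmic
form is the one usually associated with Bernoulli; applying it to 1 plus
the winnings, so that winning nothing has utility zero, is a simplifying
assumption made here and is not taken from this report.

```python
import math


def expected_log_utility(max_heads):
    """Expected utility of the coin-tossing gamble, truncated at max_heads.

    Exactly n heads followed by a tail has probability 1 / 2**(n + 1) and
    pays 2 + 4 + ... + 2**n = 2**(n + 1) - 2 dollars in total.
    Utility is ASSUMED to be log(1 + winnings); this is illustrative only.
    """
    total = 0.0
    for n in range(max_heads + 1):
        winnings = 2 ** (n + 1) - 2
        total += math.log(1 + winnings) / 2 ** (n + 1)
    return total


short_run = expected_log_utility(60)
long_run = expected_log_utility(200)
```

Under this assumed utility function the truncated sum settles to a finite
value: allowing 200 tosses rather than 60 changes it by a negligible
amount, whereas the expected monetary value grows without limit.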

Given  that  the  worth  factor  in  the  expectation  equation  is 
defined  as  a subjective  variable,  the  question  arises  concerning 
how probability should be defined. Although a review of the
controversy would take us too far afield, it should be noted that the
question  of  what  the  concept  of  probability  "really  means"  has 
been  the  subject  of  endless  philosophical  debate.  It  is  sufficient 
for  our  purposes  to  recognize  that  statements  of  the  type  "the 
probability  of  the  occurrence  of  event  X is  equal  to  Y"  have  been 
used  in  a variety  of  ways.  Such  a statement  is  sometimes  used  to 
refer  to  the  relative  frequency  with  which  X has  been  observed 
over  the  course  of  many  similar  situations.  Or  it  can  have  refer- 
ence to  a ratio  in  which  the  numerator  represents  the  total  number 
of  ways  in  which  the  outcome  of  an  hypothetical  experiment  can 
satisfy  some  criterion  and  the  denominator  represents  the  total 
number  of  different  outcomes  (as  when  one  says  the  proba- 
bility of  rolling  a 2 or  less  on  a fair  die  is  2/6).*  Sometimes 
a probability statement is used to refer to the strength of one's
confidence,  or  the  degree  of  one's  belief,  that  an  event  X,  as 
opposed  to  the  other  events  that  are  considered  possibilities, 
will  occur.  It  is  this  connotation  that  we  here  refer  to  as 
"subjective  probability." 

*Related to this usage of the term is the so-called "Principle of
Insufficient Reason," which directs the decision maker to consider
all possible outcomes to be of equal likelihood in the absence of
information which indicates such a consideration to be inappropriate.
See Rapoport (1964) for an interesting discussion of the
limitations of this prescription in defense of an assertion that
the six faces of a die are equally likely when one has no reason
to assert otherwise.

In some situations it makes little if any practical difference
which of these connotations one gives to the concept of probability,
inasmuch as they will all yield the same numbers. Most people
would  perhaps  agree,  for  example,  that  the  probability  of  tossing 
heads  on  a fair  coin  is  .5,  irrespective  of  their  philosophical 
position  concerning  how  probability  should  be  defined.  Many 
"probabilistic"  situations  of  interest  to  investigators  of  deci- 
sion making  do  not  easily  admit  of  an  analysis  in  terms  of  rela- 
tive frequencies,  or  even  of  theoretical  ratios,  however,  and  it 
is  perhaps  for  this  reason  that  many  decision  theorists  subscribe 
to  the  notion  that  probability  is  best  defined  in  terms  of  degree 
of belief. Rapoport (1964) defends this position in the following way.
"We are told that decisions involving the probability of the
outbreak of a nuclear war are based on 'calculated risks,' by which
term  those  who  recommend  or  make  decisions  must  imply  calculations 
involving  probabilities.  Since  the  probability  of  an  event  such 
as  the  outbreak  of  a nuclear  war  can  have  nothing  to  do  with  the 
frequency  of  such  events  (since  at  this  writing  none  has  occurred, 
and,  in  all  likelihood,  no  more  than  very  few  can  occur),  either 
the  phrase  'the  probability  of  a nuclear  war'  has  no  meaning  at 
all,  in  which  case  the  notion  of  the  'calculated  risk'  is  only 
eyewash, or else 'probability' has another meaning, having nothing
whatsoever  to  do  with  frequency"  (p.  25). 

The  argument  that  probability  often  cannot  be  defined  mean- 
ingfully in  terms  of  relative  frequencies  or  ratios  is  a strong 
one  for  resorting  to  a definition  in  terms  of  subjective  uncer- 
tainty. Even  when  an  objective  definition  is  easy  to  come  by, 
however,  one  may  question  whether  it  should  be  used  by  any  theory 
that  purports  to  be  descriptive  of  the  behavior  of  real  decision 
makers.  It  is  the  decision  maker's  own  expectation  that  is  pre- 
sumably important  in  determining  his  behavior  and  his  expectation 
must  be  calculated  in  terms  of  the  probabilities  as  he  perceives 
them. Moreover, it is required of a rational man that his behavior
be  consistent  with  the  information  at  his  disposal,  but  not  that 
he  have  perfectly  accurate  information.  Thus,  two  decision  makers 
could  behave  optimally,  but  quite  differently,  in  the  same  situa- 
tion if  their  perceptions  of  the  situation  differed,  a fact  that 
is  easy  to  accommodate  when  probability  is  defined  as  degree  of 
belief  but  not  when  it  is  defined  strictly  in  terms  of  the  ob- 
jective details  of  the  situation. 

In  the  foregoing  discussion  of  Expected  Value  and  Expected 
Utility  models,  it  was  tacitly  assumed  that  the  probability  factor 
in  the  expectation  equation  was  objectively  defined.  As  suggested 
by  Table  1,  two  additional  types  of  expectation  models  might  be 
realized  by  combining  subjective  probabilities  with  both  objective 
and  subjective  measures  of  worth.  The  resulting  models  might  be 
referred  to,  respectively,  as  Subjectively  Expected  Value  and  Sub- 
jectively Expected  Utility  models.  Although  both  of  these  types  of 
models  have  been  considered,  the  latter  is  by  far  the  more  widely  ac- 
cepted and  used.  This  model  has  been  presented  by  Savage  (1954)  and 
by Edwards (1955). Among the four models listed in Table 1 (which
have been referred to as single-stage algebraic decision models),
it has received the greatest amount of empirical support and,
at the moment, ranks as the most influential (Rapoport & Wallsten,
1972).
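
The four models differ only in which probability and which measure of worth enter the same sum of products. The following is a minimal sketch of that idea; the bet, the subjective probabilities, and the square-root utility function are all hypothetical values assumed for illustration.

```python
import math

# Each outcome of a hypothetical bet:
# (objective probability, subjective probability, dollar payoff).
BET = [
    (0.1, 0.2, 100.0),  # long shot that this assumed decision maker overweights
    (0.9, 0.8, 0.0),
]

def objective_p(outcome):
    return outcome[0]

def subjective_p(outcome):
    return outcome[1]

def dollars(outcome):
    return outcome[2]

def utility(outcome):
    # Assumed personal utility of the dollar payoff (illustrative only).
    return math.sqrt(outcome[2])

def expectation(outcomes, prob, worth):
    """Sum over outcomes of probability times worth."""
    return sum(prob(o) * worth(o) for o in outcomes)

# The four single-stage models: (probability measure, worth measure).
MODELS = {
    "Expected Value": (objective_p, dollars),
    "Expected Utility": (objective_p, utility),
    "Subjectively Expected Value": (subjective_p, dollars),
    "Subjectively Expected Utility": (subjective_p, utility),
}

def score_all(outcomes):
    return {name: expectation(outcomes, p, w) for name, (p, w) in MODELS.items()}

if __name__ == "__main__":
    for name, score in score_all(BET).items():
        print(f"{name}: {score:.2f}")
```

Passing the probability and worth measures as functions keeps the expectation rule itself fixed, so the models can be compared on the same set of alternatives.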


Savage's  (1954)  formulation  of  decision  theory  identifies  a 
number  of  "seemingly  agreeable"  (Tversky,  1969)  rules  that 
should  be  satisfied  before  it  is  appropriate  to  assign  a single 
fixed  number  denoting  worth  to  each  possible  decision  outcome  and 
a single  fixed  number  denoting  judged  likelihood  of  occurrence 
and then to select maximum products. These rules (see Becker &
McClintock, 1967) are as follows:

Rule 1: Transitivity. If, in a choice situation, the decision
maker prefers Outcome A to Outcome B and Outcome B to
Outcome C, he should prefer Outcome A to Outcome C.

Rule 2: Comparability. The decision maker should be willing
to compare two possible outcomes and decide either that he prefers
one to the other or that he has no preference between them.

Rule 3: Dominance. If the decision maker determines that,
under every possible condition, a choice of one of his alternative
actions results in an outcome at least as desirable as that which
would result from the choice of a second alternative action, and
results in a more desirable outcome under at least one possible
condition than would the second action, the second action should
not be preferred to the first.

Rule 4: Irrelevance of nonaffected outcomes. If the decision
maker determines that, for a particular state of the world,
two or more of the actions open to him result in the same outcome,
his preferences among such actions should not be affected by the
outcome associated with that state.

Rule 5: Independence of beliefs and rewards. The decision
maker's statement concerning the likelihood of occurrence of a
given outcome should not be affected by what he hopes will occur.
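
Rules 1 and 3 lend themselves to mechanical checking once preferences and outcomes are written down. The sketch below assumes a simple representation that is not part of Savage's formulation: strict preferences as a set of ordered pairs, and each action as a list of outcome worths, one per possible condition, with larger numbers more desirable.

```python
from itertools import permutations

def is_transitive(prefers):
    """Rule 1. prefers is a set of (a, b) pairs meaning a is preferred to b."""
    items = {x for pair in prefers for x in pair}
    return all(
        (a, c) in prefers
        for a, b, c in permutations(items, 3)
        if (a, b) in prefers and (b, c) in prefers
    )

def dominates(first, second):
    """Rule 3. True if first is at least as good under every condition
    and strictly better under at least one."""
    pairs = list(zip(first, second))
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)

if __name__ == "__main__":
    consistent = {("A", "B"), ("B", "C"), ("A", "C")}
    cyclic = {("A", "B"), ("B", "C"), ("C", "A")}
    print(is_transitive(consistent), is_transitive(cyclic))
    print(dominates([3, 2, 5], [3, 1, 5]))
```

A cyclic set of stated preferences fails the first check, which is the situation in which no consistent ordinal utility function can be constructed.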

Some  of  these  rules  seem  to  be  honored  as  much  in  the  breach 
as in the observance (see, for example, MacCrimmon, 1968). Violations
of Rule 1 are of major significance. This is so because
the  assumption  of  transitivity  of  preferences  is  a necessary 
requirement  for  the  construction  of  a consistent  ordinal  utility 
function.  (For  a discussion  of  the  problem  of  generating  utility 
functions  from  preference  judgments,  see  Roberts,  1970.)  Tversky 
(1969)  refers  to  transitivity  as  "the  cornerstone"  of  decision 
theory  and  points  out  that  it  underlies  measurement  models  of 
sensation  and  value  as  well.  He  also  notes  that  decision  makers 
often do violate the transitivity rule in specific situations.

Another rule which seems difficult to satisfy is that requiring
independence of beliefs and rewards (Rule 5). MacCrimmon
(1968) has found a strong dependency between an individual's
estimates of the likelihoods of events and his "tastes"--the
worths  he  assigns  to  those  events.  As  might  be  anticipated,  such 
an  association  may  pose  difficult  analytic  problems,  since,  for 
a given  set  of  choices,  one  cannot  assume  that  a distribution  of 
(stated)  preferences  arises  simply  out  of  differences  associated 
with  but  one  of  the  two  parameters  in  the  expectation  equation. 

In  principle,  this  problem  is  similar  to  the  so-called  "conjoint 
measurement  problem"  which  has  received  major  attention  in  the 
context  of  Subjectively  Expected  Utility  theory. 

A variant  of  the  dominance  principle  (Rule  3)  has  been  stated 
by  MacCrimmon  so  as  to  apply  to  the  problem  of  comparing  alter- 
natives that  differ  with  respect  to  several  attributes  when 
preferences  can  be  stated  with  respect  to  single  attributes 
individually: "When comparing all alternatives, if some alternative
has higher attribute values for all attributes, we say
that  this  alternative  'dominates'  the  others.  We  can  weaken  this 
notion  somewhat  and  say  that  if  one  alternative  is  at  least  as 
good  as  the  other  alternatives  on  all  attributes,  and  is  actually 
better  on  at  least  one  of  them,  then  this  can  still  be  considered 
the  dominant  alternative.  Conversely,  if  one  alternative  is 
worse  than  some  other  alternative  for  at  least  one  attribute,  and 
is  no  better  than  equivalent  for  all  other  attributes,  then  we 
can  say  the  former  alternative  is  dominated  by  the  latter"  (p.  18). 
Some writers have noted that the dominance criterion is inconsistent
with the maximin criterion of game theory (Marschak, 1950;
Luce & Raiffa, 1957). Ellsberg (1961) has discussed additional
problems  with  this  rule. 

Some  other  assumptions  that  have  usually  been  considered 
necessary to the use of expectation models are the following: (1)
that the act of gambling has no utility itself; (2) that the
subjective probabilities associated with the alternative decision
outcomes sum to unity; (3) that preferences are independent of the
method  by  which  they  are  measured.  It  has  not  been  possible  to 
demonstrate that the first two of these assumptions are simultaneously
valid. Moreover, Slovic (1966) and others (Lichtenstein
& Slovic, 1971; Lindman, 1970) have shown that preferences among
gambles  may  indeed  depend  in  part  on  the  method  by  which  they 
are  obtained  (e.g.,  a rating  procedure  versus  a bidding  procedure). 
In  spite  of  these  limitations,  expectation  models,  and  in  par- 
ticular the  Subjectively  Expected  Utility  model,  have  proven  to 
be  reasonably  predictive  of  at  least  certain  types  of  choice 
behavior  (Coombs,  Bezembinder,  & Goode,  1967).  They  clearly  do 
not,  however,  tell  the  whole  story  of  how  to  account  for  human 
choice behavior.

The demonstration that expectation models such as those described
are unable to account for choice behavior consistently
and  completely  has  led  some  theorists  to  seek  to  modify  (which  has 
invariably  meant  to  complicate)  the  models  to  make  them  more  de- 
scriptive. Other  theorists  have  simply  rejected  them  out  of  hand. 
Payne  (1971)  points  out  that  models  such  as  those  we  have  con- 
sidered involve  the  representation  of  risky  alternatives  as  proba- 
bility distributions  over  sets  of  decision  outcomes,  and  attribute 
the  choice  among  the  decision  alternatives  to  some  function  of  each 
distribution's  central  tendency.  In  the  hope  of  developing  models 
with  greater  predictive  power,  some  theorists  have  looked  not  only 
to  central  tendency  measures,  such  as  expected  or  mean  values, 
but  to  variances  and  higher  moments  of  these  distributions  as  well 
(Becker  & McClintock,  1967).  Still  others  have  made  modifications 
that  relax  the  requirement  that  the  decision  maker's  choice  be 
invariably dictated by which of his alternatives represents the
greatest expectation; "random utility" models have been proposed,
for  example,  which  assume  that  the  utility  of  a given  outcome  is 
a random  variable  and  that  variations  in  this  variable  produce 
variations  in  choice  (Becker,  DeGroot,  & Marschak,  1963). 
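
One way to see how a random utility model produces variable choice is to assume, purely for illustration, that the utility of each outcome is its mean value plus independent Gaussian noise of equal spread. This normal-noise form is an assumption of the sketch, not the specific formulation of the authors cited. Under it, the probability of choosing one outcome over another has a closed form.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def choice_probability(mean_a, mean_b, sigma):
    """Probability that A's sampled utility exceeds B's when each is drawn
    independently from a normal distribution with standard deviation sigma.
    The difference of the two draws has standard deviation sigma * sqrt(2)."""
    return normal_cdf((mean_a - mean_b) / (sigma * math.sqrt(2.0)))

if __name__ == "__main__":
    # Hypothetical mean utilities: the better outcome is usually, not always, chosen.
    print(round(choice_probability(6.0, 5.0, 1.0), 3))
    print(round(choice_probability(6.0, 5.0, 5.0), 3))
```

As the assumed noise grows relative to the difference in mean utilities, the choice probability approaches .5, so the model predicts inconsistency precisely where alternatives are nearly equal in worth.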

Shackle's  (1967)  assessment  of  expectation  models  is  represen- 
tative of  the  opinions  of  theorists  who  reject  such  models  out  of 
hand.  He  argues  that  the  concept  of  mathematical  expectation, 
and,  indeed,  the  concept  of  probability  as  well,  are  irrelevant 
to  the  assessment  of  one-of-a-kind  decision  situations.  Further- 
more, he  contends,  most  real-life  decision  situations  of  interest 
are,  to  those  who  face  them,  unique  events;  never  before  has  the 
individual  been  called  upon  to  make  exactly  the  choice  that  he 
faces  and  never  again  will  he  have  to  select  from  among  the  same 
set of action alternatives under precisely the same circumstances.
In  such  cases,  Shackle  argues,  the  decision  maker  is  concerned 
with  what  can  happen  as  a result  of  his  choice,  not  with  what 
would  happen  if  the  experiment  were  repeated  a large  number  of 
times: "he is concerned with possibility and not probability"
(p. 40). We should note that the argument implies a relative-
frequency  connotation  of  probability,  a connotation  that  not  all 
decision  theorists  accept. 

Miller  and  Starr  (1969)  suggest  that  one  can  always  find  a 
way  to  view  a decision  problem  as  a maximization  problem  if  one 
wants to do so: the quantity that the decision maker wishes to
maximize is the degree of attainment of his objective. But this
is  not  very  helpful  as  a definition:  indeed,  it  comes  close  to 
being  tautological.  Miller  and  Starr  apparently  do  not  intend 
to  assert  as  an  empirical  fact  that  decision  makers  do  attempt  to 
maximize  anything.  More  generally,  whether  decision  makers  attempt 
to  find  optimum  solutions  to  their  decision  problems  Miller  and 
Starr  consider  to  be  questionable.  Simon  (1955)  has  taken  the 
position that they usually do not. According to his "principle
of bounded rationality," what they do instead is to define a
limited  set  of  acceptable,  or  "good  enough,"  decision  outcomes 
and  then  select  a strategy  that  they  consider  to  be  likely  to 
achieve  one  of  these. 
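
The difference between seeking a "good enough" outcome and seeking the optimum can be sketched as follows. The sequential search, the aspiration level, and the job-offer data are assumptions introduced for illustration; they represent one common reading of Simon's principle rather than his own formalization.

```python
def satisfice(alternatives, worth, aspiration):
    """Return the first alternative examined whose worth meets the
    aspiration level, or None if no alternative is acceptable."""
    for alternative in alternatives:
        if worth(alternative) >= aspiration:
            return alternative
    return None

def optimize(alternatives, worth):
    """Return the alternative of greatest worth after examining all of them."""
    return max(alternatives, key=worth)

# Hypothetical alternatives in the order encountered: (name, worth).
OFFERS = [("first offer", 55), ("second offer", 70), ("third offer", 90)]

def offer_worth(offer):
    return offer[1]

if __name__ == "__main__":
    print(satisfice(OFFERS, offer_worth, aspiration=65))
    print(optimize(OFFERS, offer_worth))
```

The satisficer stops at the second offer without ever evaluating the third, which illustrates why the two procedures can yield different choices from the same set of alternatives.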

The  current  status  of  expectation  models  among  investigators 
of  decision  making  is  reasonably  well  summarized  by  three  obser- 
vations. (1)  The  models  that  are  seriously  advocated  as  descrip- 
tive of  human  behavior  are  rather  more  complex  than  the  straight- 
forward Expected  Value  model  that  was  originally  proposed.  The 
history  of  the  development  of  expectation  models  may  be  fairly 
characterized  as  a progression  from  the  simple  to  the  more  complex: 
objectively  defined  variables  have  been  replaced  with  variables 
defined  in  subjective  terms,  and  the  number  of  model  parameters 
has  been  increased.  (2)  Even  the  most  complicated  models  have 
not  proven  to  be  totally  descriptive  of  behavior  and  some  theorists 
have  challenged  the  validity  of  the  basic  assumption  of  this  class 
of  models,  namely  that  the  decision  maker  is  motivated  to  maximize 
an  expectation,  no  matter  how  the  factors  from  which  expectation 
is  computed  are  defined.  (3)  Their  limitations  notwithstanding, 
expectation  models — even  the  least  sophisticated  Expected  Value 
model — do  a reasonably  good  job  of  predicting  choice  behavior  in 
many situations. The challenge is to come up with models that
can  handle  the  situations  for  which  these  models  fail,  as  well 
as  those  for  which  they  succeed.  Meanwhile,  when  the  maximization 
of expectation is recognized as the decision objective, then
expectation models can be used prescriptively to guide the
decision  process. 

2.3 Game Theory

The  theory  of  games  was  developed  to  deal  with  situations 
in  which  the  outcomes  of  an  individual's  decisions  depend  not  only 
upon  his  own  actions  but  also  upon  those  of  one  or  more  "opponents" 
— decision  makers  whose  objectives  conflict  to  some  degree  with 
his own. Of special interest is the so-called "zero-sum" situation
in which the worths of the outcomes to the opponents sum to
zero;  one  loses  what  another  wins.  A commonly  prescribed  strategy 
for  each  "player"  of  a zero-sum  game  is  to  make  choices  in  such 
a way  as  to  minimize  his  maximum  possible  loss,  the  so-called 
minimax  rule. 
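
For a two-person zero-sum game with a finite set of choices, the minimax rule can be sketched directly from a loss matrix. The matrix below is hypothetical, and the sketch is limited to pure strategies; it does not cover the mixed (randomized) strategies that the full theory allows.

```python
def minimax_choice(loss):
    """loss[i][j] is the player's loss if he picks option i and the opponent
    picks option j. Returns (option index, worst-case loss) for the option
    whose maximum possible loss is smallest."""
    worst_case = [max(row) for row in loss]
    choice = min(range(len(loss)), key=lambda i: worst_case[i])
    return choice, worst_case[choice]

def opponent_view(loss):
    """In a zero-sum game the opponent's loss is the negative of the
    player's loss, with the roles of rows and columns exchanged."""
    rows, cols = len(loss), len(loss[0])
    return [[-loss[i][j] for i in range(rows)] for j in range(cols)]

# Hypothetical 2 x 2 loss matrix for the row player.
LOSS = [
    [3, 5],
    [2, 4],
]

if __name__ == "__main__":
    print(minimax_choice(LOSS))
    print(minimax_choice(opponent_view(LOSS)))
```

In this example the two players' minimax values are equal in magnitude and opposite in sign, so neither can improve his guaranteed result by deviating alone; not every loss matrix has this property when only pure strategies are considered.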

The assumptions of game theory are open to a number of criticisms.
Shackle (1967), for example, characterizes the theory of
games, as developed by von Neumann and Morgenstern, as "essentially
a study  of  the  logic  of  how  to  present  as  impregnable  a front 
as possible to an infallibly wise and rational opponent" (p. 61).
The  assumption  that  one  is  in  a conflict  and  that  one's  opponent 
is  rational  and  infallibly  wise  leads  directly  to  the  minimax 
doctrine. Shackle questions to what extent this conceptualization
can be taken as a reasonable approximation to reality. "Is the
impersonal world of nature or even that of business actively concerned
to defeat us? Is the human opponent reasonably assumed
to  be  infallible?  Is  there  no  essential  and  ineradicable  uncer- 
tainty in  the  outcomes  of  such  few  big  experiments,  large  in  time 
scale  in  comparison  with  the  human  life-span,  that  any  of  us  has 
time  to  make?  Rather  than  minimax  our  losses,  is  it  not  more 
reasonable  to  fix  for  them  some  maximum  tolerable  numerical  size, 
to  avoid  any  action-scheme  which  would  bring  losses  larger  than 
this  within  the  range  of  possible  or  'too-possible'  outcomes,  and 
subject to this constraint to choose that action-scheme which
brings  within  the  range  of  possible  or  'sufficiently  possible' 
outcomes, as high a positive success as we can find?" (p. 65).
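
The rule Shackle proposes in place of minimax can be sketched as a two-step procedure: discard every action-scheme whose possible losses exceed a tolerable maximum, then choose from the remainder the scheme with the highest possible gain. The sketch simplifies his proposal by treating every listed outcome as equally "possible" and ignoring degrees of possibility; the schemes and figures are hypothetical.

```python
def shackle_choice(schemes, max_tolerable_loss):
    """schemes maps a scheme name to its list of possible outcomes,
    with losses expressed as negative numbers. Returns the admissible
    scheme with the greatest possible gain, or None if none is admissible."""
    admissible = {
        name: outcomes
        for name, outcomes in schemes.items()
        if min(outcomes) >= -max_tolerable_loss
    }
    if not admissible:
        return None
    return max(admissible, key=lambda name: max(admissible[name]))

def minimax_loss_choice(schemes):
    """For comparison: the scheme whose worst outcome is least bad."""
    return max(schemes, key=lambda name: min(schemes[name]))

# Hypothetical action-schemes and their possible outcomes.
SCHEMES = {
    "bold": [-500, 900],
    "moderate": [-100, 400],
    "cautious": [-10, 50],
}

if __name__ == "__main__":
    print(shackle_choice(SCHEMES, max_tolerable_loss=150))
    print(minimax_loss_choice(SCHEMES))
```

With a tolerable loss of 150, the constrained rule selects the moderate scheme, whereas a strict minimization of maximum loss selects the cautious one and forgoes most of the attainable gain.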

In  a similar  vein,  Becker  and  McClintock  (1967)  question 
what  they  refer  to  as  game  theory's  "principle  psychological  assump- 
tions." They  point  out  that  the  theory  assumes,  on  the  one  hand, 
that  both  decision  makers  will  attempt  to  maximize  their  own 
utility  and,  on  the  other  hand,  will  attempt  to  minimize  their 
maximum  losses.  These  assumptions  are  inconsistent  unless  the 
decision  makers  look  at  the  game  from  each  other's  points  of 
view — a requirement  which  Morin  (1960)  finds  unsupportable  on 
empirical  grounds--and  unless  the  utilities  of  each  decision  maker 
are  known  to  the  other  and  sum  to  zero  for  each  possible  outcome. 

Despite  its  limitations,  game  theory  has  provided  a valuable 
framework  within  which  to  view  decision  making  in  such  fields  as 
economics,  political  science,  social  psychology  and  military 
strategy.  The  theory  has  been  extended  to  cover  non-zero-sum 
situations,  situations  permitting  cooperation  or  collaboration 
among subsets of players of multiperson games. In addition to
minimax,  other  strategies  have  been  identified  as  either  prescrip- 
tively  appropriate,  or  descriptive  of  behavior,  in  particular 
situations.

A short  and  very  readable  exposition  of  the  basic  concepts 
of  game  theory  may  be  found  in  Edwards  (1954).  A comprehensive 
tutorial  treatment  is  provided  by  Luce  and  Raiffa  (1957). 

2.4 Decision Theory and Training

It  is  a reasonable  question  to  raise  whether  one  may  hope  to 
be  an  effective  decision  maker  in  a variety  of  situations  without 
some intellectual appreciation for the decision-making process,
as  it  is  represented  by  theoretical  treatments  of  decision  making. 
One  would  guess  that  there  would  be  some  advantage  to  being  famil- 
iar, at  least  with  certain  of  the  key  concepts  that  decision 
theorists  employ.  In  practice,  this  would  mean  providing  would-be 
decision  makers  with  a basic  introduction  to  probability  theory 
as well as a working familiarity with notions of rationality,
value,  utility,  mathematical  expectation,  risk,  risk  preferences, 
and so on.

In fact, one could make the case that failure to provide an
adequate grounding in theory might deprive the decision maker of
the sorts of insights that would lead to productive use of available
decision-aiding techniques. The demonstration by MacCrimmon
(1968)  that  decision  aids  developed  in  quite  disparate  contexts 
can  be  effectively  brought  together  in  the  solution  of  problems 
involving  multi-attribute  alternatives,  suggests  the  utility  of 
broad  acquaintance  with  basic  concepts  and  principles. 

In  reporting  one  effort  to  develop  a system  to  assist  cor- 
porate decision  makers  by  enabling  them  to  manipulate  parameters 
(entered  as  distribution  functions)  on  preprogrammed  tree  models, 
Beville,  Wagner,  and  Zanatos  (1970)  made  some  observations  that 
are  relevant  to  this  point.  They  noted  that  the  use  of  subjective 
probability  distributions  as  inputs  to  models  is  novel  even  to 
experienced  decision  makers,  and  must  be  carefully  taught.  More 
generally,  they  concluded  that  a black-box  approach  to  utilization 
of  the  system  would  have  been  markedly  inferior  to  one  in  which 
the  workings  of  the  system  were  explained  to  the  user. 

The  teaching  of  decision  theory  should,  of  course,  distin- 
guish what  is  intended  to  be  prescriptive  from  what  is  considered 
descriptive  of  the  behavior  of  human  decision  makers.  It  should 
also  clearly  identify  the  limitations  of  the  models  that  are 
considered.  Tutorial  treatments  of  decision  theory  and  game 
theory  are  readily  available  sources  of  training  material  (Edwards, 
1954; Edwards & Tversky, 1967; Howard, 1968; Lee, 1971; Luce
& Raiffa, 1957; Miller & Starr, 1967; North, 1968; Rapoport, 1960;
Schlaifer, 1969). A comprehensive bibliography of research reports
has  been  prepared  by  Edwards  (1969). 

Whether  familiarization  with  theoretical  treatments  of  de- 
cision making  will  in  fact  improve  decision-making  behavior  is  a 
question  for  empirical  research.  Our  guess  is  that  the  answer 
will be a qualified yes. Such training will be efficacious for
some people performing certain types of decision tasks but perhaps
not  for  all  people  or  all  tasks.  One  objective  of  training  research 
should  be  to  identify  those  conditions  under  which  such  training 
would  be  effective  and  those  under  which  it  would  be  a waste  of 
time.

SECTION III

CONCEPTUALIZATIONS  OF  DECISION  SITUATIONS  AND  TASKS 

Numerous  ways  of  conceptualizing  decision  processes  have  been 
proposed  by  different  investigators.  Some  conceptualizations 
emphasize differences among decision situations; others focus on
the  tasks  that  decision  makers  are  required  to  perform.  All  of 
them  have  the  same  purpose,  namely  that  of  simplifying  the  problem 
of  thinking  about  decision  making  by  identifying  a few  "types," 
each  of  which  is  representative,  in  terms  of  some  critical  aspects, 
of  a variety  of  specific  situations  or  tasks.  We  review  briefly  in 
this  section  a number  of  proposed  simplifying  conceptualizations. 
There  is  no  attempt  to  be  exhaustive.  The  intent  is  simply  to 
illustrate  by  means  of  a few  examples  some  of  the  ways  in  which 
investigators  of  decision  making  have  characterized  or  categorized 
the  object  of  their  study. 

3.1  Classifications  of  Decision  Situations  or  Decision  Types 

3.1.1  Edwards 

Edwards (1967) makes a distinction between static and dynamic
decision situations. In the former case, a one-time decision is
required, whereas in the latter, sequences of decisions are made,
earlier decisions and their outcomes having implications for subsequent
ones. Six types of dynamic decision situations are
distinguished  on  the  basis  of  such  factors  as  whether  the  environ- 
ment is  stationary  or  nonstationary,  whether  or  not  the  environment 
is affected by the decisions that are made, and whether or not the
information  about  the  environment  is  affected  or  controlled  by 
those  decisions.  Edwards  further  classifies  psychological  research 
relating to decision making under four topics: information
seeking, intuitive statistics, sequential prediction, and Bayesian
processing. 

3.1.2  Howard 

Howard  (1968)  characterizes  decision  situations  in  terms  of 
three orthogonal dimensions: degree of uncertainty, degree of
complexity (number of relevant variables), and degree of time
dependence. The various combinations of the extreme values on
these dimensions are taken as representative of eight prototypical
situations,  for  each  of  which  there  is  an  appropriate  set  of 
analytical  tools.  An  example  of  a deterministic  (no  uncertainty), 
single  variable,  static  (time-independent)  problem  would  be  to 
determine  the  largest  rectangular  area  that  can  be  enclosed  with 
a fixed  amount  of  fencing.  The  appropriate  mathematical  tool  would 
be  the  calculus.  Decision  problems  like  assigning  customers  to 
warehouses  or  jobs  to  men  would,  in  Howard's  taxonomy,  be  in  the 
category defined as deterministic, complex (many variables), and
static. Matrix algebra and linear optimization are appropriate
mathematical  techniques. 
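
A deterministic, many-variable, static problem of the assignment kind can be stated compactly in code. The sketch below uses exhaustive enumeration rather than the matrix or linear optimization methods mentioned above, since enumeration is the easiest approach to verify by hand; the cost matrix is hypothetical.

```python
from itertools import permutations

def best_assignment(cost):
    """cost[i][j] is the cost of giving job j to man i (square matrix).
    Tries every one-to-one assignment and returns (assignment, total cost),
    where assignment[i] is the job given to man i."""
    n = len(cost)

    def total(assignment):
        return sum(cost[i][assignment[i]] for i in range(n))

    best = min(permutations(range(n)), key=total)
    return list(best), total(best)

# Hypothetical costs for three men and three jobs.
COST = [
    [9, 2, 7],
    [6, 4, 3],
    [5, 8, 1],
]

if __name__ == "__main__":
    print(best_assignment(COST))
```

The number of assignments examined grows as the factorial of the number of men, so enumeration is practical only for small problems; this is the reason specialized optimization techniques are preferred for problems of realistic size. The single-variable fencing example is simpler still: for a perimeter P and one side of length x, the area x(P/2 - x) is greatest at x = P/4, that is, for a square.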

3.1.3  Sidorsky 

Sidorsky  and  his  colleagues  have  proposed  a taxonomy  of  types 
of  decisions  encountered  in  tactical  military  situations  (Sidorsky, 
Houseman, & Ferguson, 1964; Sidorsky & Simoneau, 1970; Hammell &
Mara, 1970). The acronym ACADIA is used as a mnemonic for the six
types  of  "situational  demands"  identified  by  the  taxonomy: 
Acceptance,  Change,  Anticipation,  Designation,  Implementation,  and 
Adaptation. 

An  acceptance-type  decision  has  to  do  with  applying  data  to 
the acceptance or rejection of a hypothesis concerning some characteristic
of the enemy. Detection, classification, and localization
are associated operations or objectives. The acceptance-
decision  idea  seems  to  be  close  to  what  some  other  investigators 
have  referred  to  as  situation  diagnosis.  A change-type  decision 
involves  the  decision  maker  in  a choice  between  initiating  a new 
tactical  operation  or  continuing  the  course  of  action  on  which  he 
is already launched. An anticipation-type decision is required
when  a decision  maker  must  predict  what  the  state  or  intention  of 
an  enemy  force  will  be  sometime  in  the  future. 

A designation-type  decision  involves  the  choice  of  one  from 
among  a set  of  possible  action  alternatives.  An  implementation- 
type  decision  has  to  do,  not  with  the  selection  of  an  action 
alternative,  but  with  the  determination  of  the  proper  time  to 
execute it. An adaptation-type decision is called for when the
decision maker is faced suddenly with unexpected and perhaps
potentially  disastrous  circumstances. 

3.2  Classifications  of  Decision  Tasks 


3.2.1  Howard 

Howard  conceives  of  the  decision  process  as  being  composed 
of three phases: (1) the deterministic phase, (2) the probabilistic
phase, and (3) the information phase. In the deterministic
phase,  the  decision  analyst  identifies  the  state  and  decision 
variables  and  constructs  a model  of  the  decision  problem.  In  the 
probabilistic  phase,  he  assigns  probability  distributions  on  the 
state  variables.  In  the  information  phase,  he  determines  what 
additional  information  should  be  gathered  to  reduce  uncertainty 
further.  Howard  estimates  that  the  first  phase  represents  about 
60%  of  the  total  effort  of  the  decision  maker,  while  the  second 
and third phases represent about 25% and 15%, respectively.

3.2.2 Adelson

A taxonomy  of  decision  tasks  that  are  carried  out  in  modern 
military command-and-control systems is proposed by Adelson (1961).
Four types of tasks are distinguished: (1) characterization of the
state of the world, (2) determination of the available action
alternatives,  (3)  outcome  prediction  and  (4)  choice  rationalization. 
The  first  task  type  refers  to  the  need  of  the  decision  maker  to 
characterize  the  current  state  of  the  world  in  a way  that  is 
relevant  to  his  decision  problem.  The  definition  of  the  variables 
in terms of which the characterization should be made, and the
assessment  of  the  relative  stability  of  the  world  that  is  being 
observed  are  seen  as  significant  problems.  The  second  task  type 
acknowledges  the  need  to  make  explicit  the  courses  of  action  that 
are  open  to  the  decision  maker.  The  difficulty  of  this  task  may 
depend  somewhat  on  how  rapidly  the  situation  is  changing  and  on  the 
cost of obtaining information. Outcome prediction refers to the
process  of  attempting  to  anticipate  what  the  consequences  would  be 
if  specific  action  alternatives  were  selected.  The  final  task  type 
involves  the  need  to  justify  one's  choice  of  action  in  terms  of  the 
objectives  of  the  command-and-control  system. 

3.2.3  Drucker 

Drucker  (1967)  has  identified  six  steps  that  he  considers  to 
be  involved  in  the  process  of  making  the  types  of  decisions  that 
confront  business  executives:  (1)  the  classification  of  the  problem, 
(2)  the  definition  of  the  problem,  (3)  the  specifications  which  the 
answer to the problem must satisfy, (4) the decision as to what is
"right" (as distinguished from what is acceptable in order to meet
the  boundary  conditions),  (5)  the  building  into  the  decision  of  the 
action  to  carry  it  out,  and  (6)  the  feedback  which  tests  the 
validity  and  effectiveness  of  the  decision  against  the  actual  course 
of  events. 

3.2.4  Soelberg 

Soelberg's (1966) taxonomy, like Drucker's, identifies six
aspects of the decision-making process: (1) problem recognition,
(2) problem definition, (3) planning, (4) search, (5) confirmation,
and (6) implementation.

3.2.5  Hill  and  Martin 

A model  proposed  by  Hill  and  Martin  (1971)  also  recognizes 
six  different  categories  of  activities  in  the  decision-making 
process: (1) identification of concern, (2) diagnosis of situation,
(3) formulation of action alternatives, (4) test of feasibility of
selected alternatives, (5) adoption of alternative, and (6) assessment
of consequences of adopted alternative. The model assumes that
the decision maker's behavior at each of these steps is influenced
by  what  he  knows  of  the  theory  and  practice  of  decision  making  as 
well  as  by  what  he  knows  about  the  setting  in  which  the  decision 
problem  exists.  Hill  and  Martin  identify  nineteen  skills  that 
they  consider  to  be  implicit  in  these  six  generic  activity 
categories:

"1.  Asking  for  and  receiving  feedback 

2.  Assembling  the  facts  (including  past  experience  as  it  bears 
on  the  decision) 

3.  Identifying  the  courses  of  action  available 

4.  Identifying  forces  for  and  against  the  alternatives 

5.  Ranking  and  rating  alternatives  (includes  putting  a value 
on  applicable  risk  factors) 

6.  Assessing  the  people-task  ratio 

7.  Identifying  the  latest  and  expected  consequences  of  the 
alternative  courses  of  action 

8.  Determining  the  advantages  and  disadvantages  of  each  action 
alternative 

9.  Testing  the  validity  and  effectiveness  of  the  consequences 
of  the  decision  against  the  actual  course  of  events  to 
evaluate  the  decision  maker's  judgment  and  to  modify  his 
subsequent  decision-making  behavior 

10.  Brainstorming  action  alternatives 

11.  Classifying and defining the problem requiring a decision

12.  Analyzing  and  evaluating  stimuli  and  decisions  coming  in 
from  the  outside 

13.  Defining  the  goal  at  which  the  decision  is  directed 

14.  Communicating  the  decision  in  written  or  verbal  composition 

15.  Identifying  resources  bearing  on  the  making  of  the  decision 

16.  Recognizing  the  need  for  a decision 

17.  Utilizing  minor,  relatively  simple  decisions  to  contribute 
to  making  the  more  complex  one  (includes  determining  the 
hierarchy  of  order  in  which  minor  decisions  will  be  dealt 
with  and  coping  with  timing  as  alternatives  come  into  focus 
and  seemingly  demand  attention  at  the  same  time) 

18.  Obtaining  information 

19.  Specifying  the  boundary  conditions  the  decision  must 
satisfy" (p. 433).

3.2.6  Edwards

Edwards  (1965b)  lists  the  following  thirteen  steps  that  must 
be  carried  out  by  any  Bayesian  decision  system: 

"1.  Recognize  the  existence  of  a decision  problem 

2.  Identify  available  acts 

3.  Identify  relevant  states  that  determine  payoff 
for  acts 

4.  Identify  the  value  dimensions  to  be  aggregated 
into  the  payoff  matrix 

5.  Judge  the  value  of  each  outcome  on  each  dimension 

6.  Aggregate  value  judgments  into  a composite 
payoff  matrix 

7.  Identify  information  sources  relevant  to 
discrimination  among  states 

8.  Collect  data  from  information  sources 

9.  Filter  data,  put  into  standard  format, 
and  display  to  likelihood  estimators 

10.  Estimate  likelihood  ratios  (or  some  other 
quantity  indicating  the  impact  of  the  datum 
on  the  hypotheses) 

11.  Aggregate  impact  estimates  into  posterior 
distributions 

12.  Decide  among  acts  by  using  principle  of 
maximizing  expected  value 

13.  Implement the decision" (p. 142, Table 1).

Steps 1 through 5, and 7 and 10, Edwards suggests, are best performed
by men, Steps 6, 11 and 12 by machines, and Steps 8, 9 and
13  by  both  men  and  machines.  Steps  1 through  7 may  be  done  in 
advance  of  the  decision  time;  Steps  8 through  13  must  be  done  at 
the  time  that  the  decision  is  to  be  made.  (See  Section  VIII  for 
a discussion  of  Bayesian  information  processing.) 
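
Steps 10 through 12 of this list lend themselves to a small worked sketch. The example below is illustrative only: the two states, the two acts, the payoff values, and the likelihoods are hypothetical numbers chosen for the example and are not taken from Edwards (1965b). It aggregates the impact of each datum into a posterior distribution by Bayes' rule and then selects the act with the greatest expected value.

```python
def update_posterior(prior, likelihoods):
    """Step 11: combine a distribution over states with one datum's likelihoods."""
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def best_act(posterior, payoff):
    """Step 12: return the act with the largest expected payoff, and all values."""
    expected = {
        act: sum(p * v for p, v in zip(posterior, values))
        for act, values in payoff.items()
    }
    return max(expected, key=expected.get), expected


# Hypothetical problem: two states (S1, S2) and two acts; payoff[act][state].
prior = [0.5, 0.5]
payoff = {"act_A": [10.0, -5.0], "act_B": [0.0, 4.0]}

# Step 10: each datum is summarized as (P(datum | S1), P(datum | S2)).
data = [(0.8, 0.4), (0.8, 0.4)]

posterior = prior
for likelihoods in data:
    posterior = update_posterior(posterior, likelihoods)

choice, expected = best_act(posterior, payoff)
print(posterior, choice, expected)
```

With these assumed numbers, two data that each favor S1 by a likelihood ratio of 2 move the posterior from an even split to 0.8 for S1, and the first act is then preferred.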


3.2.7  Schrenk

A conceptualization  of  the  decision-process  that  we  find 
particularly interesting is one proposed by Schrenk (1969). The
motivation for developing this conceptualization was to provide a
representation of the decision-making process that is prescriptive
in the sense that it can be used as a guide for the structuring
of decision-making tasks of man-machine systems, but which does not
make unrealistic assumptions about human capabilities. The
conceptualization is viewed by Schrenk as tentative, and in need of
further  development;  however,  even  as  it  stands  it  provides  the 
system  designer  with  a great  deal  of  food  for  thought  concerning 
how  to  allocate  decision  functions  among  men  and  machines. 

Three major categories of decision tasks, or phases of the
decision process, are distinguished: (1) problem recognition,
(2) problem diagnosis, and (3) action selection. Each of these
phases is further broken down into several components, and flow
diagrams are given which show where the components appear in the
overall process. The following is a paraphrasing of Schrenk's
description of each of these components.

• Problem Recognition: Determination that a problem requiring a
decision exists.

- Acquire information: Receipt of information indicating that
actual situation differs from the desired situation.

- Recognize objectives: The decision maker's purpose or mission.

- Perceive decision need: Perception of difference between
objectives and current situation; may result from change
in situation or in objectives.

- Assess problem urgency and importance: Establishment of
priority of problem, relative to other problems demanding
attention, and allocation of resources for solving it.

• Problem Diagnosis: Determination of the situation that is
causing problem.

- Define possible situations: Generation of hypotheses
regarding situation.

- Evaluate situation likelihoods: Assignment of a priori
probabilities to alternative hypotheses.

- Determine whether more information is needed: Assessment
of adequacy of information in hand; a continuing process.

- Identify possible data sources: If more information is
desired.

- Judge value versus cost: To determine whether, or how,
desired information should be acquired.

- Seek more information: Assuming value judged to be greater
than cost.

- Re-evaluate situation likelihoods: Iterate.

- Determine whether alternatives under consideration account
for all the data: Recognition of possible need to modify
set of hypotheses being considered.

- Make diagnostic decision: Selection of favored hypothesis,
or possibly of small set of weighted alternatives.

• Action Selection: Choice of course of action.

- Define action goals: Specification of explicit goals,
including interim or subordinate objectives.

- Specify value and time criteria: Identification of
relevant dimensions of multidimensional goals and
specification of time constraints within which decision
must be made.

- Weight decision criteria: Establishment of relative
importance of various decision criteria.

- Specify risk philosophy: Specification of strategy of
action selection insofar as it is dictated by considerations
of balancing risks against potential gains.

- Input operating doctrine: Consideration of any rules or
doctrine by which the decision maker's behavior should be
guided.

- Generate action alternatives: Explicit listing of reasonable
set of courses of action open to decision maker.

- Predict possible outcomes: Specification of the possible
outcome associated with each of the potential action
alternatives.

- Estimate outcome gains and losses: Determination of value
of possible decision outcomes.

- Estimate outcome likelihoods: Estimation of probabilities
of occurrence of possible outcomes for each action
alternative.

- Evaluate expected values of actions versus their costs:
Derivation, from preceding two steps, of expected value of
each possible action, and estimation of associated cost.

- Evaluate actions by risk philosophy: Assessment of each
action alternative in terms of its implications for the
risk philosophy that the decision maker has adopted.

- Determine whether more information is needed: As under
Diagnosis; a continuing question; new information might be
useful either for identifying additional action possibilities,
or to improve predictions concerning possible
decision outcomes.

- Seek information: If desired, and worth cost of acquisition.

- Re-evaluate action alternatives: Iterate.

- Determine whether best action is acceptable: Review of most
desirable action alternative to assure its acceptability,
in terms of the decision goals and criteria, the expected
gains from the choice and the cost of making it.

- Choose course of action: The "decision."

- Implement action: Initiation of whatever steps are
necessary to assure that the selected action is carried out.
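
Two of the Action Selection components, evaluating expected values of actions versus their costs and evaluating actions by risk philosophy, can be illustrated with a short sketch. Everything in it is an assumption made for illustration: the action names, probabilities, gains, and costs are hypothetical, and the two "risk philosophies" shown (maximize net expected value, or maximize the worst-case outcome) are simply two familiar selection strategies, not ones that Schrenk is known to have specified.

```python
def net_expected_value(action):
    """Expected gain over the action's possible outcomes, less the action's cost."""
    gain = sum(p * value for p, value in action["outcomes"])
    return gain - action["cost"]


def worst_case(action):
    """Value of the least favorable outcome, less the action's cost."""
    return min(value for _, value in action["outcomes"]) - action["cost"]


def choose(actions, risk_philosophy="expected"):
    """Pick an action name: "expected" uses net expected value,
    any other setting uses the cautious worst-case criterion."""
    score = net_expected_value if risk_philosophy == "expected" else worst_case
    return max(actions, key=lambda name: score(actions[name]))


# Hypothetical alternatives; each outcome is (probability, gain or loss).
actions = {
    "bold": {"cost": 2.0, "outcomes": [(0.6, 20.0), (0.4, -10.0)]},
    "safe": {"cost": 1.0, "outcomes": [(0.9, 5.0), (0.1, 2.0)]},
}

print(choose(actions))              # selection by net expected value
print(choose(actions, "cautious"))  # selection by worst-case outcome
```

Under these assumed numbers the two philosophies disagree: the bold action has the higher net expected value, while the safe action has the better worst case. This is one way of seeing why the risk philosophy has to be specified before the alternatives are evaluated.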


The  main  fault  that  we  have  to  find  with  Schrenk's  model  is 
that  it  may  be  overly  elaborate.  It  is  doubtful  that  many 
individuals go through anything approaching this multistep procedure
in the process of making a decision. This is perhaps an
unjustified  criticism,  inasmuch  as  Schrenk  intended  the  model  to 
be  more  prescriptive  than  descriptive.  And  whether  such  a model 
can  serve  as  a prototype  procedure  for  decision  makers  to  follow 
remains  to  be  seen.  In  any  case,  the  representation  does  serve  the 
useful  function  of  making  explicit  many  of  the  aspects  of  decision 
making and it stands as a reminder that decision making may be
viewed  as  a complex  and  multifaceted  process  indeed. 

3.3  Decision Making as a Collection of Problem-Solving Tasks

We  take  the  position  that  decision  making  is  best  conceived 
as  a form  of  problem  solving;  or,  more  specifically,  that  it 
involves  a variety  of  aspects  each  of  which  may  be  viewed  as  a 
problem-solving  task  in  its  own  right.  In  the  most  general  terms, 
the  decision  maker's  problem  is  to  behave  in  a rational,  or  at 
least a reasonable, manner. To be sure, the distinctive characteristic
of the specific problems with which the decision maker deals
is the element of choice; he must at some point decide upon one
from among two or more alternative courses of action. While the
act of choosing among alternatives is central to decision making,
it is by no means the only problem--or even necessarily the most
difficult one--that the decision maker must solve. We wish to
emphasize the importance of making explicit the other things that
must be done if one is motivated to make the best possible--or
at least a satisfactory--decision, given the resources at one's
disposal. In many real-life situations, the problem of choosing
among possible courses of action is far simpler than that of
discovering what one's options are in the first place, or of
assigning preferences to possible decision outcomes in a consistent
way. Also, the decision maker may find it necessary to
make many preliminary decisions simply by way of setting the stage
for making the decision which is his primary concern. For example,
he will want to reduce his uncertainty about the decision situation
or about the consequences of the various choices that are
open  to  him.  However,  the  acquisition  of  information  takes  time, 
and  may  be  costly  in  other  ways,  so  he  will  continually  be  faced 
with  the  problem  of  deciding  whether  any  additional  information 
that  he  may  wish  to  get  is  worth  the  cost  of  getting  it. 

It  is  clear  from  the  foregoing  that  there  are  many  ways  to 
classify  the  various  tasks  that  the  decision  maker  may  be  required 
to  perform.  The  scheme  that  we  find  most  satisfactory  recognizes 
eight aspects of decision making: information gathering, data
evaluation, hypothesis generation, problem structuring, hypothesis
evaluation, preference specification, action selection, and decision
evaluation.


This conceptualization has an element of arbitrariness about
it--as  does  any  other.  There  are  four  points  that  we  would  like 
to  make  in  this  regard.  First,  the  decision  to  conceptualize  the 
process  in  terms  of  eight  types  of  tasks,  as  opposed  to  some  other 
number,  is  itself  somewhat  arbitrary,  and  reflects  our  own  biases 
concerning  what  constitutes  a useful  level  of  organization.  One 
might  conceptualize  the  decision  process  at  a much  coarser  level 
and  distinguish  two  major  types  of  tasks — diagnosis  and  action 
selection — that  would  encompass  all  of  those  that  we  wish  to 
distinguish.  This  approach  has  been  taken  by  several  investigators 
(Bowen, Nickerson, Spooner, & Triggs, 1970; Kanarick, 1969;
Williams & Hopkins, 1958). Bowen et al. (1970) point out that in
the  military,  diagnosis  is  the  proper  function  of  intelligence,  and 
action selection that of command. At the other extreme, one
might  attempt  a much  finer  grained  representation  and  identify  a 
much  larger  number  of  activities  that  a decision  maker  may  be 
called  upon  to  perform.  In  this  case,  each  of  the  tasks  we  have 
identified  might  be  replaced  with  several  more  detailed  tasks. 

These are not mutually exclusive approaches, of course, and we will
have occasion to consider how some of the tasks we have identified
may be further broken down. However, this level of analysis
appears to us to be the most useful one for our present purpose,
and possibly for serving as a general framework in terms of which
to  think  about  decision  making  as  a whole. 

Second, our taxonomy is not orthogonal to other conceptualizations
such as those discussed in the preceding section. It has
elements in common with most of them. Indeed, the intent is not
to take issue with other taxonomies, but to propose one that represents
what, in our view, are the best aspects of all of them.

Third,  we  do  not  mean  to  suggest  that  whenever  an  individual 
finds  himself  performing  the  role  of  a decision  maker  he  explicitly 
runs  through  this  set  of  tasks  in  serial  fashion,  or  even  that  he 
performs  each  of  these  tasks  explicitly  at  all.  Moreover,  when 
he does perform these tasks it is not necessarily the case that he
is fully aware of doing so. It is characteristic of human beings
that they often can solve problems quite effectively without having
any clear idea how they do it. This characteristic has been a
frustration  to  researchers  in  artificial  intelligence,  who  have 
found  it  exceedingly  difficult  to  program  computers  to  perform  some 
tasks  that  human  beings  seem  to  be  able  to  perform  with  ease. 

What  we  do  mean  to  suggest  by  the  proposed  taxonomy  is  that  all  of 
these  types  of  activities  are  implicated  in  decision  making  and 
that  any  attempt  at  a thorough  discussion  of  the  decision-making 
process  must  take  account  of  them. 

Finally,  viewing  decision  making  as  a problem-solving  process 
that  is  composed  of  several  phases  or  subprocesses  emphasizes  the 
fact  that  in  any  given  decision  situation,  different  decision  tasks 
could  be  performed  by  different  individuals  or  groups  (or  machines). 
An  implication  for  training  is  that  it  may  be  less  appropriate  to 
think of training decision makers per se than of training individuals
to play specific roles in the decision-making process. On the other
hand,  there  will  undoubtedly  always  be  some  situations  in  which  all 
the  various  aspects  of  a decision  problem  will  be  handled  by  the 
same individual. But whatever the case, there is perhaps something
to be gained by making decision makers--or specialist members of
decision-making groups--aware of the many facets of the general task.

In the next few sections of this report, we consider each of
the  components  of  our  task  taxonomy  in  turn.  The  order  in  which 
the  tasks  are  discussed  represents  a natural  progression;  however, 
in real-life decision situations, an individual in a decision-making
system may perform several of these tasks more or less
simultaneously.  Or  he  may  skip  from  one  to  another  in  a variety 
of  orders,  and  may  perform  any  given  type  of  task  many  times  in 
the  course  of  attempting  to  solve  a single  decision  problem. 

SECTION  IV

INFORMATION  GATHERING 

From the point of view of the decision maker, most decision
situations  are  characterized  by  some  degree  of  uncertainty.  This 
uncertainty  may  involve  the  current  "state  of  the  world,"  the 
decision  alternatives  that  are  available,  the  possible  consequences 
of  selecting  any  given  one  of  them,  and  even  the  decision  maker's 
preferences  with  respect  to  the  possible  decision  outcomes.  One 
of  the  major  problems  facing  the  decision  maker,  therefore,  is 
that  of  acquiring  information  in  order  to  reduce  his  uncertainty 
concerning  such  factors,  thereby  increasing  his  chances  of  making 
a decision  that  will  have  a desirable  outcome. 

What  makes  the  problem  interesting,  and  nontrivial,  is  the 
fact  that  information  acquisition  can  be  costly,  both  in  terms  of 
time  and  money.  Therefore,  the  decision  maker  must  determine 
whether  the  value  of  the  information  that  could  be  obtained  through 
any given data-collection effort is likely to be greater than the
cost  of  obtaining  it.  And  therein  lies  a decision  problem  in  its 
own  right. 
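
One standard way of bounding the worth of information, offered here as an illustration and not as a method drawn from the studies reviewed, is the expected value of perfect information: the difference between the expected payoff if the true state were revealed before acting and the expected payoff of the best act chosen on present knowledge. No data-collection effort can be worth more than this amount, so an effort whose cost exceeds it can be rejected at once. The states, acts, payoffs, and cost below are hypothetical.

```python
def expected_value_of_perfect_information(prior, payoff):
    """Upper bound on what any information about the state could be worth."""
    # Best expected payoff if the decision is made now, on the prior alone.
    value_now = max(
        sum(p * v for p, v in zip(prior, values)) for values in payoff.values()
    )
    # Expected payoff if the true state were revealed before acting.
    value_informed = sum(
        p * max(values[s] for values in payoff.values())
        for s, p in enumerate(prior)
    )
    return value_informed - value_now


# Hypothetical numbers: two states, two acts; payoff[act][state].
prior = [0.7, 0.3]
payoff = {"act_A": [10.0, -20.0], "act_B": [0.0, 5.0]}
cost_of_information = 3.0

evpi = expected_value_of_perfect_information(prior, payoff)
# Exceeding the cost does not guarantee the purchase is worthwhile,
# since real data are imperfect; failing to exceed it rules the purchase out.
may_be_worth_buying = evpi > cost_of_information
print(evpi, may_be_worth_buying)
```

With these assumed values the bound is 7.0 against a cost of 3.0, so the purchase cannot be ruled out on this ground alone; whether imperfect data are actually worth the cost still depends on how diagnostic they are.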

In theory, one can see an infinite regress here. In order to
decide  whether  to  initiate  any  information-collecting  effort,  one 
must  determine  the  worth  of  the  information  to  be  collected  and 
the  cost  of  collecting  it.  But  in  order  to  determine  that,  one 
may  have  to  collect  some  information — at  some  cost,  and  so  on.  In 
practice, of course, infinite regresses never occur; and in this
case,  one  very  quickly  gets  to  a point  at  which  the  decision  maker 
relies  on  information  in  hand,  or  appeals  to  his  own  intuitions. 

4.1  Information Seeking versus Information Purchasing

Information gathering may be thought of as involving two quite
different activities: (1) information seeking (locating the information
that one needs or wants), and (2) information purchasing
(deciding whether information, the location of which is known, is
worth what it will cost to acquire it). This distinction is something
of an oversimplification, inasmuch as the act of seeking
itself typically involves some cost, and one often must decide
whether to incur that cost without any assurance that the search
will yield the information that is desired. The aspect of "seeking"
that we wish to emphasize, however, is the need for identifying
and actively searching out information sources, of finding
out where the desired information is and going after it. The term
"purchasing" is used to connote a more passive role on the part of
the decision maker: the opportunity to acquire information is presented
to him and he need only indicate whether or not he wants
to avail himself--at some cost--of the information that is offered.

The distinction between information seeking and information
purchasing  is  a useful  one  because  it  highlights  the  fact  that 
experimental  studies  have  focused  almost  exclusively  on  the  latter 
process;  although  investigators  often  have  not  made  the  distinction 
and  have  frequently  discussed  their  results  as  though  they  had  to 
do  with  the  former.  Typically,  the  decision  maker  is  presented 
with  all  the  information  that  he  needs — although  he  may  have  to 
decide how much of it to purchase--and the process of seeking information
is not studied. The world outside the laboratory is
not  nearly  so  accommodating,  however,  and  one  must  either  seek 
out  the  information  one  wants,  or  go  without  it.  Moreover,  studies 
of information purchasing, while they tell us something about how
effectively  people  can  judge  the  worth  of  information  that  is  made 
available  to  them,  shed  little  light  on  information-seeking  behavior. 

Perhaps  the  main  reason  why  information-seeking  behavior  has 
not been widely studied is the difficulty of manufacturing situations
in the laboratory that are representative of those faced by
decision makers in the real world. In any case, whatever the
reasons, information seeking per se has not received the attention
from investigators of decision making that it deserves. The experiments
that we have reviewed that purport to deal with this
topic  invariably  have  actually  studied  information  purchasing  as 
we  have  defined  that  term. 

4.2  Optional-Stopping Experiments

An experimental paradigm that has often been used to study
information-purchasing behavior is one in which the decision maker
is provided with the opportunity on each trial either of purchasing
more data that are relevant to the decision that he is required
to make, or of making the decision. The terms "deferred decision,"
"optional stopping," and "optimal stopping" have all been used to
refer to this paradigm. "Deferred decision" and "optional stopping"
connote the fact that the subject in such an experiment has
the option on each trial of making a terminal decision or deferring
it in order to obtain more data. "Optimal stopping" refers to
the fact that when the situation is sufficiently well-structured
so that the costs and payoffs associated with possible decision
outcomes, the cost and informativeness of data, and the decision
maker's objectives are all known, the point can be determined at
which information purchasing should be stopped and the decision
made. The "optional-stopping" paradigm is to be contrasted both
with the more familiar paradigm in which the experimenter determines
how much information the decision maker will be given, and
with what is usually called the "fixed-stopping" paradigm in which the
decision maker specifies how much information he wishes to purchase,
in advance of receiving any.

Often, in optional-stopping experiments, the required decision
concerns the parameters of the distribution from which
the observational data are being drawn. For example, one may
have to decide whether a sequence of red and black poker chips
that one observes is drawn from a population in which the proportion
of reds to blacks is, say, 60-40 or 30-70. The question of
interest in such experiments is whether the subject's information-purchasing
behavior deviates from optimality, and if so, in what
way.
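
The poker-chip task can be made concrete with a short sketch of the inference an ideal Bayesian observer would draw from the chips seen so far. It assumes, for illustration, that the two populations are equally likely beforehand and that chips are drawn independently with replacement; the function and parameter names are hypothetical. H1 denotes the 60-40 population and H2 the 30-70 population.

```python
def posterior_h1(sequence, p_red_h1=0.6, p_red_h2=0.3, prior_h1=0.5):
    """Posterior probability of H1 after observing a string of 'R'/'B' chips."""
    odds = prior_h1 / (1.0 - prior_h1)
    for chip in sequence:
        if chip == "R":
            # A red chip is twice as likely under H1 as under H2.
            odds *= p_red_h1 / p_red_h2
        else:
            # A black chip favors H2, by a ratio of 0.4 to 0.7.
            odds *= (1.0 - p_red_h1) / (1.0 - p_red_h2)
    return odds / (1.0 + odds)


print(posterior_h1("RRB"))
```

For the sequence red, red, black the posterior odds are 2 x 2 x (4/7) = 16/7, giving a posterior probability of 16/23, or about 0.70, for the 60-40 population. A subject's willingness to stop at such a point can then be compared with what the costs and payoffs of the task would justify.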

What constitutes optimal performance has been worked out for
a variety of specific situations (Birdsall & Roberts, 1965;
Blackwell & Girshick, 1954; Raiffa & Schlaifer, 1961). For our
purposes it suffices to recognize that, in general, the amount of
information (number of observations) that should be purchased
will vary directly with the magnitude of the costs and values
associated with the decision outcomes, and inversely with the
cost and "diagnosticity" of the data that are purchased. Diagnosticity
refers to the degree to which the data should reduce
the decision maker's uncertainty about which of the terminal decision
alternatives should be selected. The diagnostic value of
a datum  depends  on  several  factors  (some  of  which  are  discussed 
in  Section  VIII),  and  typically  decreases  as  the  number  of  data 
that  have  already  been  collected  increases.  A factor  that  usually 
is  not  taken  into  consideration  in  optional-stopping  experiments 
but  can  be  critical  in  real-life  situations  is  the  importance  of 
time  itself.  In  some  situations  the  potential  consequences  of  a 
decision  are  highly  time-dependent.  This  fact  can  be  incorporated 
in  an  optimal-stopping  rule  by  making  the  cost  of  an  observation, 
or  the  stopping  criterion,  a function  of  time. 
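
A minimal sketch of a stopping criterion that depends on time is given below. It is not an optimal rule and is not taken from any of the studies cited; it only illustrates the idea. The observer, deciding between a population with 60 percent red chips (H1) and one with 30 percent red chips (H2), stops as soon as the posterior probability of the favored hypothesis reaches a required level, and that level is relaxed as observations accumulate. The threshold schedule and all parameter values are hypothetical.

```python
def threshold(t, start=0.95, floor=0.60, decay=0.05):
    """Required posterior confidence after t observations (hypothetical schedule)."""
    return max(floor, start - decay * t)


def observations_until_stop(sequence, p_red_h1=0.6, p_red_h2=0.3):
    """Return (number of chips observed, hypothesis chosen) for a string of 'R'/'B'."""
    p_h1 = 0.5  # equal prior probabilities assumed
    for t, chip in enumerate(sequence, start=1):
        like_h1 = p_red_h1 if chip == "R" else 1.0 - p_red_h1
        like_h2 = p_red_h2 if chip == "R" else 1.0 - p_red_h2
        p_h1 = p_h1 * like_h1 / (p_h1 * like_h1 + (1.0 - p_h1) * like_h2)
        # Stop once the favored hypothesis meets the current requirement.
        if max(p_h1, 1.0 - p_h1) >= threshold(t):
            return t, ("H1" if p_h1 >= 0.5 else "H2")
    # Data exhausted: decide on the current posterior.
    return len(sequence), ("H1" if p_h1 >= 0.5 else "H2")


print(observations_until_stop("RRRR"))
```

With four red chips in a row, the posterior for H1 passes through 0.67, 0.80, and 0.89. A fixed requirement of 0.95 would not yet be met after the fourth chip, whereas the declining requirement is met at the third.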

Typically, performance in optional-stopping experiments has
been found not to be optimal. Moreover, as illustrated by a
study by Green, Halbert, and Minas (1964), the deviation from
optimality may be in either direction. In one experiment, Green
et al. found that the number of observations purchased increased
with the a priori uncertainty concerning the correct decision,
as would be expected of an efficient Bayesian processor; however,
subjects tended to purchase too many observations when the a priori
uncertainty was maximized by providing no prior information concerning
the likelihoods of the correctness of the possible decisions.
In combination, the results of these experiments suggest
that decision makers may sometimes purchase too much information,
and sometimes too little. In particular, it would appear that
they may purchase too much information if the a priori uncertainty
is small, and too little if the a priori uncertainty is large.

Many investigators have used the optional-stopping paradigm
(Becker, 1958; Edwards, 1967; Edwards & Slovic, 1965; Fried &
Peterson, 1969; Howell, 1966; Irwin & Smith, 1957; Pitz, 1968, 1969;
Pruitt, 1961; Schrenk, 1964; Snapper & Peterson, 1971; Swets &
Birdsall, 1967). The results of most of these studies suggest
that although information seeking may approach optimal levels
(Becker, 1958; Howell, 1966; Pruitt, 1961), there are reasonably
systematic departures from perfect performance. The general
finding seems to be that too little information is sought when
(theoretically) much is required, and that too much is sought
when little is required. The latter finding fits well with the
conservatism or inertia effect often noted in studies of Bayesian
inference, but the former clearly does not.

A few descriptive models of optional-stopping behavior have
been developed (see, for example, Edwards, 1965a; Pitz, 1968;
Pitz, Reinhold, & Geller, 1969). These models have been developed
in a Bayesian context (Rapoport & Wallsten, 1972) and tend to be
situation specific (see, for example, the "World Series Model"
of Pitz, Reinhold, & Geller, 1969).

Noting that most optimal-stopping experiments had been concerned
only with the question of when to stop acquiring information
from a single source, Kanarick, Huntington, and Petersen (1969)
suggested that a more valid simulation of some decision-making
situations, e.g., tactical situations, would recognize that the
decision maker must deal with information from more than one source.
In keeping with this observation, Kanarick et al. did an optional-stopping
study in which the decision maker had the option on each
trial of acquiring data from his choice of three sources, or of
making a terminal decision. The terminal decision that was required
involved the presence or absence of an enemy submarine in
the vicinity. The information sources differed, both with respect
to the cost of obtaining information from them and with respect to
the reliability of the information obtained. (The topic of reliability
of information will be discussed more fully in Sections
V and VIII.) Costs associated with incorrect decisions were also
manipulated. Although the behavior of the subjects was consistent
with the rational model in many ways--they were willing to pay more
for more reliable information; how much information they collected
before making a particular decision depended on how bad the consequences
would be if that decision proved to be incorrect--
performance was less than optimal in several respects. The subjects
tended, for example, to consult the most reliable (and most
costly) sources less frequently and the less reliable (and less
costly) sources more frequently than they should have. Kanarick
et al. characterized this behavior as a form of conservatism, "a
reluctance to expend the resources necessary to obtain the best
information in a choice situation" (p. 382). The subjects also
tended to purchase less data in general than they should have, and,
consequently, made more incorrect decisions and won fewer points
than did a Bayesian model that was used to represent optimal behavior.
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Levine  and  Samet  (1973)  have  also  studied  information  qather- 
inq--information  purchasing  in  our  terms— in  a simulated  tactical 
situation.  The  scenario  was  a military  action  and  the  subjects’ 
task  was  to  decide  which  ol  eiqht  locations  was  the  target  of  a 
hypothetical  enemy  advance.  On  each  trial,  .1  subject  could  either 
make  a terminal  decision  or  request,  nddi  tional  information  from 
each  of  three  intclliqence  sources  concerning  the  present  where- 
abouts of  the  advancing  force.  A sequence  of  reports  from  a given 
source  represented  the  path  that  the  advancing  force  had  taken 
over  a period  of  time,  accordinq  to  that  source.  Among  the  vari- 
ables that  were  manipulated  were  the  reliability  of  the  intelli- 
gence sources,  the  degree  of  conflict  among  reports  from  different 
sources,  and  the  probability  that  a request  lor  information  would 
yield  an  updated  report  (as  opposed  to  a repetition  of  the  preced- 
ing report).  Performance  was  sensitive  to  each  of  the  variables. 

In  particular,  fewer  reports  were  requested  and  decisions  were 
more  often  correct  when  all  the  sources  were  reliable,  and  the 
quality  of  performance  tended  to  decline  as  the  percentage  of  the 
sources  that  were  unreliable  was  increased.  Increasing  the  degree 
to  which  the  sources  were  in  coni  lict  also  had  the  effect  of  de- 
creasing the  number  of  reports  requested.  (This  counterintuitive 
result  may  be  due  in  part  to  the  fact  that  as  conflict  increased 
in  this  experiment,  so  did  the  probability  that  the  correct  target 
was  indicated  by  at  least  one  of  the  sources  on  a given  trial.) 

The  number  of  requests  for  reports  decreased  as,  the  probability  that 
a given  report  would  yield  new  information  increased;  the  relation- 
ship was  such,  however,  that  the  amount  of  information  (number  of 
updates)  received  increased  with  this  variable. 

In a subsequent experiment, in which the same decision problem
was used, Levine, Samet, and Brahlek (1974) varied the rate at
which new reports were given to the subject, whether the reports
were delivered automatically or in response to the subject's
request, the possibility of revising an initial decision, and the
payoff scheme. In this case, performance was better for the faster
rates of information acquisition, but was not highly sensitive to
whether the rate was self- or force-paced. Increasing the
opportunity for revising a decision had the effect of decreasing the
accuracy of first decisions and the subjects' confidence in them.
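
The kind of Bayesian benchmark against which subjects were compared
in the information-purchasing studies of this section can be
illustrated with a small sketch. It is not the model used by
Kanarick et al.; it assumes a two-state problem (submarine present
or absent), sources whose reports match the true state with a fixed
probability, and a one-step-ahead (myopic) purchase rule. The names
Source, posterior, expected_loss, and net_value_of_report, and all
numbers, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    name: str
    reliability: float  # assumed P(report matches the true state)
    cost: float         # price of one report, in payoff points


def posterior(prior: float, reliability: float, says_present: bool) -> float:
    """P(present | report) by Bayes' rule for a symmetric binary source."""
    like_present = reliability if says_present else 1.0 - reliability
    like_absent = 1.0 - reliability if says_present else reliability
    numerator = like_present * prior
    return numerator / (numerator + like_absent * (1.0 - prior))


def expected_loss(p_present: float, loss_miss: float, loss_false_alarm: float) -> float:
    """Expected loss of the better terminal decision at belief p_present."""
    # Deciding "absent" risks a miss; deciding "present" risks a false alarm.
    return min(p_present * loss_miss, (1.0 - p_present) * loss_false_alarm)


def net_value_of_report(prior: float, source: Source,
                        loss_miss: float, loss_false_alarm: float) -> float:
    """Expected reduction in loss from one report, minus the report's cost."""
    r = source.reliability
    p_says_present = r * prior + (1.0 - r) * (1.0 - prior)
    loss_after = 0.0
    for says_present, p_report in ((True, p_says_present),
                                   (False, 1.0 - p_says_present)):
        if p_report == 0.0:  # this report cannot occur; skip it
            continue
        belief = posterior(prior, r, says_present)
        loss_after += p_report * expected_loss(belief, loss_miss, loss_false_alarm)
    return expected_loss(prior, loss_miss, loss_false_alarm) - loss_after - source.cost


if __name__ == "__main__":
    sources = [Source("costly, reliable", 0.9, 5.0), Source("cheap, unreliable", 0.6, 1.0)]
    for s in sources:
        print(s.name, net_value_of_report(0.5, s, 100.0, 100.0))
```

Under these assumptions a report is worth buying only when its net
value is positive, and the source with the largest net value is the
one to consult. A benchmark of this kind makes it possible to say
that a subject consulted the costly, reliable source "less
frequently than he should have."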

4.3  Decision Revision and Effect of Commitment on Information Gathering

The results of a few studies suggest that one's information-
gathering behavior may be different after making a decision than
before, particularly if the making of the decision involves some
sort of public acknowledgment or commitment (Geller & Pitz, 1968;
Gibson & Nichol, 1964; Pruitt, 1961; Soelberg, 1967). People may
require more information, for example, to change a decision than
was required to arrive at a decision in the first place (Gibson &
Nichol, 1964; Pruitt, 1961). This observation is in keeping with
the results of several studies that suggest that evidence that tends
to confirm a favored hypothesis is often given more credence than
evidence that tends to disconfirm it (Brody, 1965; Geller & Pitz,
1968; Pitz, Downing, & Reinhold, 1967). And sometimes disconfirming
evidence may even be misinterpreted as supportive of a decision
that has already been made (Grabitz & Jochen, 1972).

The motivation for acquiring information may change, following
a decision, from that of trying to increase the probability of
making a good decision to that of justifying or rationalizing a
decision that has already been made. Soelberg (1967) has concluded
from a study of the job-seeking behavior of graduates of the Sloan
School that people frequently make an implicit selection from among
the existing opportunities, following which "a great deal of
perceptual and interpretational distortion takes place in favor of
the choice candidate" (p. 29). In a somewhat similar vein, Morgan
and Morton (1944) have asserted that people often accept conclusions
that are consistent with their convictions without regard for the
validity of the inferences on which those conclusions are based,
and that "the only circumstance under which we can be relatively
sure that the inferences of a person will be logical is when they
lead to a conclusion which he has already accepted" (p. 19). (We
will return to the question of the logicality of thought in
Section 8.3.)

One suspects that in real-world situations the information-
seeking behavior that follows the making of a decision may often
differ considerably from that which precedes it. In particular,
one would guess that to the degree that the motive of the
information seeker is the rationalization of a decision already made,
the process would become highly selective as to the sources consulted.

4.4  Quantity of Information and Quality of Decision

It is quite natural to assume that the more data one has that
are relevant to a choice that he must make, the better his choice
will be. The assumption, without qualification, is not valid
(Ackoff, 1967; Fleming, 1970; Hayes, 1964; Hoepfl & Huber, 1970;
Sidorsky & Houseman, 1966). It is possible, indeed easy, to provide
an individual with more information than he can assimilate and use,
especially if he is operating under some time pressure. The point
is illustrated nicely by an experiment by Hayes.

Hayes had naval enlisted men make decisions concerning which of
several airplanes to dispatch to investigate a reported submarine
sighting in a simulated tactical situation. The available airplanes
differed with respect to such characteristics as speed, distance of
its base from the target, delay before it could take off, quality
of its pilot, quality of its radar, and so on. Each characteristic
could take on any of eight (not necessarily numerical) "values,"
which could be ranked unequivocally from best to worst. The number
of available airplanes from which a subject had to choose was varied
(4 or 8) as was the number of characteristics (2, 4, 6 or 8) on
which he was to base his choice. The effect of the latter variable
is of particular interest. Decision time increased markedly with
this variable; however, the decision quality--which was defined
objectively in two ways--did not. Hayes hypothesized that, other
things equal, one's sensitivity to the way two alternatives differ
with respect to individual characteristics decreases as the number
of characteristics that must be considered increases. Of particular
relevance to this review is the fact that Hayes trained a second set
of subjects for several days to see if they would learn to make
better decisions with the larger amounts of information. Although
the quality of decisions was generally somewhat higher after training
than before, the relationship between decision quality and number of
characteristics on which a decision was based did not change.

We should not conclude from this study that one should never,
under any circumstances, be provided with more than a very few items
of information that are relevant to any choice that one may have to
make. One might conclude, however: (1) that decision makers should
be trained to recognize their limitations for assimilating
information, and to avoid attempting to operate beyond them, and (2)
that to the extent that the functional relationship between the
desirability of the various choice alternatives that are open to the
decision maker and the values of the factors that determine it is
known, the implication of particular sets of factor values should
probably be computed, and not estimated by men. The problem of
determining, or discovering, such functional relationships is a
nontrivial one. (See Section IX.)
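
What it might mean to compute, rather than estimate, the implication
of a set of factor values can be sketched as follows. The additive
form, the weights, and the aircraft data are assumptions made purely
for illustration; they are not taken from the Hayes study, and, as
noted above, discovering the right functional relationship is itself
a nontrivial problem.

```python
def score(ranks: dict, weights: dict) -> float:
    """Weighted sum of characteristic values (higher value = better)."""
    return sum(weights[name] * ranks[name] for name in weights)


def best_alternative(alternatives: dict, weights: dict) -> str:
    """Return the name of the alternative with the highest computed score."""
    return max(alternatives, key=lambda name: score(alternatives[name], weights))


if __name__ == "__main__":
    # Hypothetical data: each characteristic is rated 1 (worst) to 8 (best).
    weights = {"speed": 0.4, "distance": 0.3, "delay": 0.2, "radar": 0.1}
    airplanes = {
        "plane_a": {"speed": 8, "distance": 3, "delay": 5, "radar": 2},
        "plane_b": {"speed": 5, "distance": 7, "delay": 6, "radar": 7},
    }
    print(best_alternative(airplanes, weights))
```

Once such a function is fixed, adding a fifth or an eighth
characteristic costs the computation nothing, whereas Hayes's
subjects took longer and decided no better.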


4.5  A Conceptualization of Information Gathering in the Real World

What makes the real-world decision maker's task particularly
difficult is the fact that the information that he would like to
have typically is distributed among a variety of sources. One way
of characterizing these sources is in terms of two properties:
degree of passivity and degree of cooperativeness. According to
this conceptualization, a source is either active or passive, and
either cooperative or uncooperative.


An actively cooperative source--the preferred type--volunteers
information, and seeks ways to get it to the decision maker. In
the military context, an intelligence officer would be an actively
cooperative source for a commander.

A passively cooperative source is one that would provide
information if solicited, but does not volunteer it. A possible
reason for not volunteering information in this case is a failure
of the source to recognize itself as such. An example, again from
a military context, would be friendly inhabitants of an area of
operations who have information that would be valuable to a
military commander, but are unaware of the fact. The problem that
the decision maker has vis-a-vis passively cooperative sources is
to identify and find them.

An actively uncooperative source has information that would
be of use to the decision maker, but being motivated to thwart
the decision maker's objectives if possible, volunteers information
that is misleading. A propagandist is an example of such a source.
The decision maker's problem with respect to actively uncooperative
sources is to recognize them as such and to assess the information
obtained from them accordingly.

A passively uncooperative source is one that withholds
information from the decision maker, and further will not provide
it if asked. Hostile noncombatants in an area of military operations
might fit this description, as might espionage agents. The decision
maker's problem with respect to passively uncooperative sources is
to persuade them to change their status and become actively
cooperative. History, both real and fictitious, is replete with
accounts of the unsavory methods that have been employed to this
end.
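
The four-way classification just described can be written down as
a small lookup structure, which may be convenient if sources are to
be tagged in a training exercise or simulation. This is only a
restatement of the text; the type and function names (Activity,
Stance, SourceType, classify) are hypothetical, and no problem is
recorded for the actively cooperative case because the text states
none.

```python
from enum import Enum
from typing import NamedTuple, Optional


class Activity(Enum):
    ACTIVE = "active"      # volunteers information
    PASSIVE = "passive"    # does not volunteer information


class Stance(Enum):
    COOPERATIVE = "cooperative"
    UNCOOPERATIVE = "uncooperative"


class SourceType(NamedTuple):
    example: str
    problem: Optional[str]  # the decision maker's problem with this type


TAXONOMY = {
    (Activity.ACTIVE, Stance.COOPERATIVE):
        SourceType("intelligence officer", None),
    (Activity.PASSIVE, Stance.COOPERATIVE):
        SourceType("friendly inhabitants", "identify and find them"),
    (Activity.ACTIVE, Stance.UNCOOPERATIVE):
        SourceType("propagandist",
                   "recognize them and assess their information accordingly"),
    (Activity.PASSIVE, Stance.UNCOOPERATIVE):
        SourceType("hostile noncombatants",
                   "persuade them to become actively cooperative"),
}


def classify(volunteers: bool, cooperative: bool) -> SourceType:
    """Look up the source type from the two binary properties."""
    activity = Activity.ACTIVE if volunteers else Activity.PASSIVE
    stance = Stance.COOPERATIVE if cooperative else Stance.UNCOOPERATIVE
    return TAXONOMY[(activity, stance)]


if __name__ == "__main__":
    print(classify(volunteers=False, cooperative=True))
```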

To the extent that laboratory studies of decision making have
been concerned with information gathering, they have involved
actively cooperative sources almost exclusively. The problem of
finding sources that are nonobvious and that of coping with those
that are noncooperative have received very little attention from
experimenters. In part this is undoubtedly due to the fact that
capturing the essence of these aspects of information gathering in
laboratory situations is a very difficult thing to do. And the
alternative of studying these processes in situ is hardly less
difficult. Until such studies are performed, however, our
understanding of how decision makers go about gathering (especially
seeking) information so as to increase their chances of making
effective decisions will remain very incomplete.

4.6  Information Gathering and Training

We stress again that laboratory studies of information
gathering have failed to capture the complexity of the problem
that often faces the information seeker outside the laboratory.
Consequently, very little is known about information-seeking
behavior as it occurs in the real world. This is unfortunate
because information seeking constitutes a particularly critical
aspect of many real-life decision problems and so long as this
behavior is not well understood, our understanding of decision
making will be incomplete.

The implications for training are obvious: training procedures
that are based on a solid foundation of factual knowledge about
human capabilities and limitations cannot be developed if the
foundation does not exist. The need is for research that is designed
to answer some of the questions that laboratory experiments
heretofore have failed to address effectively. Such questions include
the following. How good are people at identifying sources of
information that is relevant to their decision problems? How do
they go about discovering such sources? How capable are they of
assessing the cost of acquiring information that may be difficult
to get and the worth of the information that might be obtained?
To what extent can useful principles and procedures for information
seeking be made explicit and taught? It is probably fair to say
that with respect to such questions there is insufficient basis for
even an educated guess as to the answer. Clearly there is need for
some imaginative research on this aspect of the decision-making
process.

Laboratory  studies  such  as  those  reviewed  above  do  shed  some 
light  on  information  purchasing  behavior.  In  particular  they  tell 
us  something  about  human  capabilities  and  limitations  in  assessing 
the  worth  of  information  in  well  structured  situations.  Although 
it  would  be  risky  to  generalize  many  of  the  conclusions  uncritically 
to  nonlaboratory  situations,  the  conclusions  nonetheless  are  sug- 
gestive of  what  should  perhaps  be  done  by  way  of  training  or  train- 
ing research. 


NAVTRAEQUIPCEN  73-C-0128-1 


SECTION  V 
DATA  EVALUATION 


In  the  preceding  section  we  used  the  words  data  and  informa- 
tion more  or  less  synonymously.  It  will  be  helpful  at  this  point 
to  make  a distinction.  The  term  data  is  perhaps  best  used  to 
refer  to  what  one  collects,  and  the  term  information  to  connote 
whatever  conclusions  or  inferences  one  draws  from  data.  The  data 
and  the  information  extracted  therefrom  can  be  identical,  but  they 
need  not  be.  For  example,  if  a military  commander  receives  data 
to  the  effect  that  the  troop  strength  of  an  opposing  tactical 
force  is  15,000  men,  and  he  considers  the  source  to  be  a reliable 
one, he will undoubtedly accept the data as accurate and conclude
that the enemy troop strength is indeed 15,000 men. On the other
hand, if he has less than full confidence in the source of this
report, he may tentatively conclude that the troop strength is
somewhere between 5,000 and 25,000 men, and attempt to get more
data from which he can derive a more precise estimate.

The point is that as part of the process of attempting to
reduce his uncertainty about his decision situation, the decision
maker must evaluate the data that he receives as to their
pertinence and trustworthiness. In other words, the first decision
that the decision maker must make with respect to any new datum
is how seriously he should take it. He may not explicitly do
this in all cases, but to fail to do so at least implicitly is
tantamount to judging his sources as completely trustworthy and
their inputs as equally important.

5.1  The Evaluation versus the Use of Data

There are two questions relating to data quality that deserve
attention: (1) how well can people judge and report the quality
of the data on which decisions are to be based, and (2) how
effectively can they utilize information concerning quality of
data when that information is provided for them? The first of
these questions concerns what we are referring to as the task of
data evaluation, and is discussed in this section. The second
has to do with data utilization and is more appropriately
discussed in connection with hypothesis evaluation in Section VIII.

In anticipation of the latter discussion, we note here simply
that several experiments have been addressed to the question of
how effectively decision makers use knowledge of data quality.
In most such studies the performance of subjects has been compared
with that of some ideal (usually Bayesian) model (see, for example,
Funaro, 1974; Johnson, 1974; Schum, DuCharme, & DePitts, 1973;
Snapper & Fryback, 1971; Steiger & Gettys, 1972). What is most
germane to the topic of this section is the fact that the models
that are used to represent optimal behavior typically distinguish
two separate steps. The first step entails an adjustment of the
nominal diagnostic value of a datum, the value that the datum would
have if it were known to have been reliably observed or reported.
The second step involves the application of the modified datum to
the hypotheses of interest. The first step is what we are calling
data evaluation, and it is important to note that the failure of
subjects to perform this step properly appears to be one of the
reasons why they typically acquire less information from data that
are not perfectly reliable than is there to be acquired.
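
The two steps can be made concrete with a minimal sketch. It assumes
a binary datum D bearing on two hypotheses and a source that reports
what actually occurred with probability r regardless of which
hypothesis is true; this symmetric-reliability form is one common
textbook formalization and is not claimed to be the model used in
any of the studies cited. The function names are hypothetical.

```python
def adjusted_likelihood_ratio(p_d_given_h1: float, p_d_given_h2: float,
                              reliability: float) -> float:
    """Step 1: discount the nominal diagnostic value of a reported datum.

    With reliability 1.0 this is the nominal ratio P(D|H1) / P(D|H2);
    with reliability 0.5 the report carries no information (ratio 1.0).
    """
    r = reliability
    num = r * p_d_given_h1 + (1.0 - r) * (1.0 - p_d_given_h1)
    den = r * p_d_given_h2 + (1.0 - r) * (1.0 - p_d_given_h2)
    return num / den


def update_odds(prior_odds: float, likelihood_ratio: float) -> float:
    """Step 2: apply the (adjusted) datum to the hypotheses via Bayes' rule."""
    return prior_odds * likelihood_ratio


if __name__ == "__main__":
    for r in (1.0, 0.8, 0.5):
        lr = adjusted_likelihood_ratio(0.8, 0.2, r)
        print(r, lr, update_odds(1.0, lr))
```

A subject who skips the first step treats every report as if r were
1.0; one who discounts too heavily extracts less from an imperfect
report than it contains, which is the pattern described above.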

5.2  Studies  of  Data  Evaluation 


Data evaluation has been recognized by the U.S. Army as being
of sufficient importance to warrant the development of a rating
procedure for use by tactical intelligence personnel to evaluate
all incoming "spot reports" (Combat Intelligence Field Manual,
FM 30-5). The procedure, which has been standardized for use by
NATO army forces, requires that a sender of a report explicitly
rate the report both with respect to the reliability of its source
and the accuracy of its contents. The letters A through F are
used to designate estimates of reliability, and the numbers 1
through 6 to represent judged accuracy. The first five ratings
represent a scale going from "completely reliable" (A) to
"unreliable" (E) in one case, and from "confirmed by other sources"
(1) to "improbable" (5) in the other. The lowest rating in each
case is used to indicate that a judgment cannot be made:
"reliability cannot be judged" (F), "truth cannot be judged" (6).
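
A two-character rating of this kind is easy to validate and decode
mechanically. The sketch below records only the scale labels quoted
in the text above; the labels for the intermediate ratings (B
through D, 2 through 4) are deliberately left unspecified rather
than guessed. The function names are hypothetical.

```python
from typing import Tuple

# Only the end points and the "cannot be judged" labels are given in the text.
RELIABILITY = {"A": "completely reliable", "E": "unreliable",
               "F": "reliability cannot be judged"}
ACCURACY = {1: "confirmed by other sources", 5: "improbable",
            6: "truth cannot be judged"}


def parse_rating(code: str) -> Tuple[str, int]:
    """Split a rating such as 'B2' into (reliability letter, accuracy number)."""
    code = code.strip().upper()
    if len(code) != 2 or code[0] not in "ABCDEF" or code[1] not in "123456":
        raise ValueError(f"not a valid rating: {code!r}")
    return code[0], int(code[1])


def is_fully_judged(code: str) -> bool:
    """True when neither part is the 'cannot be judged' category."""
    letter, number = parse_rating(code)
    return letter != "F" and number != 6


def describe(code: str) -> str:
    """Spell out a rating, using '(intermediate)' where the text gives no label."""
    letter, number = parse_rating(code)
    return (f"source: {RELIABILITY.get(letter, '(intermediate)')}; "
            f"content: {ACCURACY.get(number, '(intermediate)')}")


if __name__ == "__main__":
    print(describe("A1"))
    print(is_fully_judged("F6"))
```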

Obviously, the purpose of using such a rating procedure is
to provide the receiver of a report with some indication of how
much confidence he should have in its contents. How effective the
procedure has been, however, is open to question. Data collected
during field exercises have indicated that ratings often are
omitted from spot reports, and that the ratings that are used are
too consistently high (Baker, McKendry, & Mace, 1968). The same
study also revealed that the reliability and the accuracy ratings
tend to be highly correlated. One possible explanation of this
correlation is that reliable sources tend to produce accurate
reports. This is an intuitively plausible explanation, and it
raises the question of the need for two ratings. The other
possible explanation for the correlation is that the rater finds it
difficult to treat reliability and accuracy as independent
dimensions. The results of a subsequent laboratory study of rating
behavior were interpreted as supporting the latter possibility
(Samet, 1975a). On the basis of his results, Samet proposed that
an attempt be made to design and validate an improved procedure
for evaluating intelligence data. Specifically, he suggested the
possibility of assigning to a report a single number that would
represent the evaluator's estimate of the likelihood of the report
being true, based on all the information available to him that
was relevant to that judgment.

5.3  The Use of Nonquantitative Qualifiers

Probably most people who evaluate data or data sources do not
do so according to a formal procedure or in quantitative terms.
More typically, they use such qualifiers as "usually reliable,"
"not very dependable," "prone to exaggerations," "very precise,"
"a bit careless," "very likely," "a rough estimate," and so forth.
Such phrases are certainly meaningful and undoubtedly can convey
important qualifying information. The problem is that not all
people mean the same thing when they use one of those phrases, and
what complicates matters is the fact that even a given individual
may use the same term to mean somewhat different things at
different times.

A number of efforts have been made to measure the extent of
agreement between individuals in their use of such qualifying
terms. A common experimental paradigm is that of providing
subjects with lists of terms or phrases and requiring them to
translate the degree of certainty or uncertainty denoted into a numeric
(typically probabilistic) estimate. The variance observed among
and within subjects in the translation then provides a measurement
of agreement. Results of these studies (see, for example,
Lichtenstein & Newman, 1967; Johnson, 1973; Samet, 1975a, 1975b)
typically show very low levels of agreement among subjects, and
the potential for considerable misunderstanding when large
vocabularies of qualifiers are used.
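
The measurement described in this paradigm amounts to computing,
for each qualifier, the spread of the numeric translations. A
minimal sketch follows; the phrases and the probabilities assigned
to them are invented for illustration and are not data from any of
the studies cited. Only between-subject spread is shown; the same
function applied to one subject's repeated translations would give
the within-subject spread.

```python
from statistics import mean, stdev


def spread_by_term(responses: dict) -> dict:
    """Map each qualifier to (mean, standard deviation) of its translations."""
    return {term: (mean(values), stdev(values))
            for term, values in responses.items()}


def least_agreed(responses: dict) -> str:
    """Return the qualifier on which the translations disagree most."""
    spreads = spread_by_term(responses)
    return max(spreads, key=lambda term: spreads[term][1])


if __name__ == "__main__":
    # Hypothetical probability estimates from four subjects per phrase.
    responses = {
        "very likely": [0.85, 0.90, 0.80, 0.95],
        "usually reliable": [0.60, 0.90, 0.70, 0.50],
    }
    for term, (m, s) in spread_by_term(responses).items():
        print(f"{term}: mean={m:.2f} sd={s:.2f}")
    print(least_agreed(responses))
```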

What factors influence the translation of a qualifier into
a numeric estimate? There seem to be no clear answers to this
question. Cohen, Dearnley, and Hansel (1951) suggested that
context in which a word is used might play a role, but a recent study
by Johnson (1973) in which the encoding of 15 different probability
words (or phrases) contained in each of three different sentence
contexts was explored failed to uncover any significant context
effect. On the other hand, a study by Rigby and Swain (1971) in
which magnitude-denoting terms such as "couple," "lots," and
"bunch" were used did suggest such an effect. For example, a
"bunch of missiles" had an average assignment of 7.73, while a
"bunch of tents" had an average assignment of 12.32. It seems
obvious on the face of it that nonquantitative terms denoting
physical magnitudes must be subject to enormous context effects.
"Small" distances are measured in angstrom units by nuclear
physicists and in light years by astronomers. Indeed, it is
difficult to see how, in the absence of context, such terms can
be considered meaningful at all. Probability terms are different
from magnitude terms in that probabilities are bounded whereas
magnitudes are not. Perhaps this helps to account for the former's
greater independence of context. It should be noted that neither
Johnson nor Rigby and Swain found significant differences in the
use of these terms due to group membership (army enlisted men and
graduate students, in the former study; army helicopter, Air Force
prop, Air Force jet, and Navy attack bomber pilots in the latter).

5.4  Data  Evaluation  and  Training 

It seems clear that full exploitation of computer-based
tactical data-analysis systems will ultimately require the use of
numeric values in place of qualitative estimates, if the
reliability of data is to be taken into account when they are used.
How best to arrive at these values is, at this point, a matter of
conjecture. One could attempt to establish a formal vocabulary
of qualitative terms and phrases, associate with each term or
phrase a specific numerical value (or range of values), and train
personnel to use the resultant isomorphisms in encoding and
decoding communications. This is the essence of a proposal made
some years ago by Kent (see Platt, 1957). Considering, however,
that formal training would be a requirement in any case, a
preferred alternative to this approach is to instruct decision makers
in the use of probability (and magnitude) scales and require
estimates to be communicated in explicitly quantitative terms
(Johnson, 1973; Samet, 1975). The obvious problem for training
research is that of developing effective procedures for training
people to evaluate data quantitatively and for increasing the
intra- and inter-person consistency with which quantitative
assessments are made.


SECTION VI
PROBLEM STRUCTURING

An exceedingly important step in solving any problem is to
be quite explicit about what the problem is that one is to solve.
And one way to be explicit is to attempt to represent the problem
in terms of a formal structure. While the need to be explicit
may appear to be too obvious to deserve comment, it is also
apparent that satisfying that need is not always an easy thing to do.
Attempts to apply computers to problem-solving tasks have
highlighted both the need for explicitness and the difficulty of
obtaining it. Armor (1964) has commented on the frustration that
is sometimes entailed when one tries to formulate a problem in
such a way that a computer can help solve it. He illustrates his
point with reference to a bank official who stated, after having
his banking procedures mechanized, "that 65 percent of the data-
processing group's effort went to deciding in detail what problem
they were solving" (p. 250). Presumably, the investment was worth
it; without it, they could not have recognized a solution had they
found one.

The act of trying to make the structure of a problem explicit
can be an instructive experience for a problem solver, inasmuch
as it forces on him the realization of what he does and does not
know about the problem on which he is working--or thinks he is.
Essentially, this observation is made by Cloot (1968) vis-a-vis
the application of computers to the decision problems of
management. He takes the position that one of the major benefits that
is to be derived from an attempt to implement a computer-based
management information system is not the help that one would get
from a functioning system, but what one can learn about the
practice of management from the implementation effort. "It can even
be argued that the successful use of a computer-based MIS should
be measured by the extent to which managers learn to improve
their performance so that they can discard it again... There is
no doubt that the changes that do come about will be due more to
managers having a better understanding of their decision processes
than to the technical facilities of the computer" (p. 280).

A major contribution of theoretical treatments of decision
making is the provision of formal models in terms of which a
decision maker can attempt to structure his own decision problems.
Invariably, such models are simplified abstractions, and
consequently may not do justice to the full details of any given
situation. Nevertheless, they do provide one with structured ways of
viewing things, which may make the problems easier to think about,
and as a consequence--hopefully--easier to solve. It has been
suggested that this is the way in which quantitative models will
have their primary effect: "I believe that the greatest impact of
the quantitative approach will not be in the area of problem
solving, although it will have growing usefulness there. Its greatest
impact will be on problem formulation; the way managers think
about their problems--how they size them up, bring new insights
to bear on them, and gather information for analyzing them. In
this sense, the results that 'quantitative people' have produced
are beginning to contribute in a really significant way to the
art of management" (Hayes, 1969, p. 108).

6.1  State-Action  Matrices 


Probably the most well-known way of representing decision
situations is in terms of state-action matrices. Such matrices make
three aspects of decision situations explicit: the hypothesized
possible "states of the world," the action alternatives that are
open to the decision maker, and the decision maker's preferences
with respect to the various possible state-action combinations.
Sometimes such matrices are referred to as payoff matrices inasmuch
as each cell of the matrix represents the cost or value (or utility)
to the decision maker of the outcome of a particular action
selection, given that the associated state hypothesis is true. A
decision problem may be represented in this way as follows:

                           Action Alternatives

                       A1      A2     ...     An

   Hypothesized   H1   U11     U12    ...     U1n
   States         H2   U21     U22    ...     U2n
   of the         ...  ...     ...    ...     ...
   World          Hm   Um1     Um2    ...     Umn

Much of the theoretical-analytical work on decision making
has been concerned with optimal strategies for selecting action
alternatives once the situation has been formally structured.
Given an explicit decision goal (e.g., minimization of risk,
maximization of expected gain) and a formal representation
of the situation, prescriptive models can provide useful guidance
for action selection. The process of representing real-life
decision situations formally, however, is at the present time more
of an art than a science. Examples of decision situations that
are easily structured can always be found; however, not all
decision problems can readily be forced to fit the same mold.

Even given that the structure shown above is an appropriate
one for a particular problem, it is clear that in order to use it
one must be able to specify, as a minimum, what the hypothesized
states of the world are, what one's action options are, and how
the various possible decision outcomes (state-action pairs) relate
to the value system that will determine the desirability of the
actual outcome. One may find it necessary to engage in a
considerable amount of information seeking in order to fill out such a
structure. Moreover, how one does fill out the structure is
determined in part by exogenous variables over which one has no control,
and in part by self-imposed constraints. The state of the world
tends to be beyond one's control; all one can do is attempt to
determine what it is likely to be. One's action alternatives,
however, may be constrained in part by limits that are self-imposed.
What are viewed as viable strategic military options, for example,
may depend on the particular military doctrine in vogue at the time.
Benington (1964) points out that the basic concept behind the
development of such automated, or semiautomated, systems as the
SAGE system in the 1950's was the concept of "set-piece warfare."
"Set-piece warfare is characterized by warning of threat, total
and preplanned goals, speed of response, and detailed and precise
management of the campaign" (p. 9). Emphasis is on massive
retaliation totally preplanned, or "spasm" response. During the
early 1960's, the set-piece warfare idea lost favor. President
Kennedy and Secretary of Defense McNamara began to emphasize the
importance of flexibility and adaptability, the ability to make
selected and controlled responses, directed toward military
(noncivilian) targets and appropriate to the (not always foreseeable)
contingencies that elicit them. Clearly, the set of action
alternatives that the strategist will consider under one of these
retaliation doctrines is quite different from that which he will
consider under the other.
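
The state-action matrix described in this section can be made concrete
with a small sketch. The Python fragment below is illustrative only:
the utilities, the state probabilities, and the names (utility,
p_state, choose) are all hypothetical, and "minimization of risk" is
interpreted here, as an assumption, as choosing the action whose worst
outcome is least bad (a maximin rule). It also shows how restricting
the set of admissible actions, as a prevailing doctrine might, can
change the selection.

```python
# Hypothetical payoff matrix: rows are hypothesized states of the world,
# columns are action alternatives. All numbers are invented for illustration.
states = ["H1", "H2", "H3"]
actions = ["A1", "A2", "A3"]
utility = [
    [10, 4, 0],   # outcomes of A1, A2, A3 if H1 is true
    [2, 5, 1],    # ... if H2 is true
    [-5, 3, 2],   # ... if H3 is true
]
p_state = [0.5, 0.3, 0.2]  # assumed probabilities of H1, H2, H3


def expected_utility(action):
    """Probability-weighted average utility of one action (one column)."""
    j = actions.index(action)
    return sum(p * row[j] for p, row in zip(p_state, utility))


def worst_case(action):
    """Lowest utility the action can yield over all hypothesized states."""
    j = actions.index(action)
    return min(row[j] for row in utility)


def choose(criterion, allowed):
    """Pick the allowed action that scores highest on the given criterion."""
    return max(allowed, key=criterion)


best_expected = choose(expected_utility, actions)          # expected gain
best_maximin = choose(worst_case, actions)                  # cautious rule
best_restricted = choose(expected_utility, ["A2", "A3"])    # A1 ruled out
```

With these invented numbers the two criteria disagree, and removing
A1 from consideration changes the expected-gain choice, which is the
sense in which self-imposed limits on the action set shape the
decision.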

6.2 Alternative Structurings of a Given Situation

It is apparent that to think in terms of the structure of a
decision space is to oversimplify matters greatly. Usually any
given situation can be structured in a variety of ways. Moreover,
how one chooses to represent a particular situation may not be
incidental. It seems to be true of problem solving in general
that how one represents a problem can be an important factor in
determining how easily one can then solve it. This point has often
been made by individuals engaged in efforts to program computers
to perform intellectually demanding tasks (see, for example,
Nilsson, 1971). The same problem may yield to attempts to solve
it when represented in one way while resisting such attempts when
represented in another.

An important aspect of developing a useful structure is that
of conceptualizing a situation at an appropriate level of detail.
Too simple a structure may violate the complexity of an actual
situation. On the other hand, Charles Peirce's maxim that "a few
clear ideas are worth more than many confused ones" seems
particularly apropos here. We state as a conjecture that a necessary
requisite for effective decision making is the ability to get
quickly to the heart of a problem, to concentrate on essentials,
and to ignore irrelevancies. What this often means in practice is
being able to see through superficialities that frequently obscure
underlying issues. Moreover, even when the situation, stripped of
incidentals, is inherently complex, there may be some merit in a
simplified conceptualization of it, provided that the fact that the
conceptualization is a simplification is not then promptly forgotten.
There is little to be gained by representing a situation in such
a complex way that the decision maker cannot grasp the representation
intellectually. What constitutes an optimal level of detail may
vary from situation to situation and from individual to individual,
but variability in this regard may not be very great. We suspect
that for the vast majority of situations and decision makers a
representation that involves more than eight or ten hypothesized
states of the world and as many action alternatives, at any given
level of description, will prove to be an unwieldy one.


6.3 Structuring as an Iterative Process


On the basis of an analysis of protocols obtained in his
classical study of problem solving, Duncker (1945) reached a
conclusion that is germane to the issue of problem structuring. The
problem that he used most frequently in his studies was the now
well-known radiation problem: "given a human being with an inoperable
stomach tumor, and rays which destroy organic tissue at sufficient
intensity, by what procedure can one free him of the tumor by these
rays and at the same time avoid destroying the healthy tissue which
surrounds it?" (p. 2H). The conclusion that Duncker came to after
observing the efforts of many people to solve such problems was
that the development of a solution typically proceeds from the more
general to the more specific. (On this point, see also Hogarth,
1974, and Kleinmutz, 1968.) The principle by which the problem
is, hopefully, to be solved emerges first, and the details of the
solution come later. It often happens that a principle may be
valid, but there turns out to be no feasible way to implement it.
A principle that was frequently identified in the case of the
radiation problem, for example, was "avoid contact between rays and
healthy tissue." When the problem solver could think of no way to
do this and still get the rays to the tumor, he had to abandon the
principle itself--even though it was a sound one--and search for
another that was not only sound but practicable.


The finding of a new principle, or a general property of a
solution, always involves, Duncker suggests, a reformulation of
the original problem. In the case of the example just given, having
accepted "avoiding contact" as a valid principle, one has in effect
defined his problem as that of finding a way to do just this. When
forced to reject a given principle as impractical, the substitution
of another (e.g., "lower the intensity of the rays on their way
through healthy tissue") in effect defines another how-to-do-it
problem to be solved. "We can accordingly describe a process of
solution either as development of the solution or as development
of the problem. Every solution-principle found in the process,
which is itself not yet ripe for concrete realization...functions
from then on as reformulation, as sharpening of the original setting
of the problem. It is therefore meaningful to say that what is
really done in any solution of problems consists in formulating the
problem more productively. To sum up: The final form of a solution
is typically attained by way of mediating phases of the process,
of which each one, in retrospect, possesses the character of a
solution, and, in prospect, that of a problem" (p. 34, italics his).


It is probably the case that complex decision problems, like
other types of complex problems, yield grudgingly to attempts to
structure them. Moreover, a decision maker may find it necessary
to formulate and reformulate a decision space several times before
arriving at a structure that he feels adequately represents the
decision problem that he must solve and does so in a way that
facilitates arriving at a solution. The willingness to discard
a favored conceptual framework when it is seen no longer to fit
the facts in hand has been considered by some to be one of the
defining characteristics of original thinking (Mackworth, 1965;
Polanyi, 1963).


6.4 Problem Structuring and Training

The question of how to train decision makers to structure
decision problems effectively has received very little attention.
Moreover, if it is true, as Edwards (1973) has suggested, that of
the several aspects of decision analysis the process of problem
structuring is least amenable to formal prescription, exactly what
should be taught is not clear.


It seems likely, however, that something is to be gained by
familiarizing decision makers with such formal representations
(models) of decision situations as are provided by decision theory
and game theory. Such training should be conducted in such a way
as not to leave the student with the unrealistic idea that all
decision situations are readily represented, without distortion,
by the same model.


Practice in representing specific situations in terms of such
models, and criteria for judging the relative merits of different
models for different problems, should probably be part of any
training program in decision making. Practice in representing a
given decision problem at different levels of detail also would
probably be beneficial. Duncker's work suggests that one approach
to problem structuring that might usefully be taught is that of
zeroing in on an appropriate formulation by a series of
approximations, proceeding from the more general to the more detailed.

But these are only conjectures. The fact is that little is
known about how to train a person to be good at imposing structure
on a problem--whether it be a decision problem or a problem of
any other kind. Mackworth (1965) has noted that one of the
characteristics of creative individuals is an exceptionally strong
need to find order where none appears on the surface. If this
is so, then one way to train people to be better problem structurers
is to train them to be more creative. If only we knew how to do
that!


An alternative to training decision makers to formalize their
decision problems is to provide them with models that are
appropriate to their particular situations, and that can then be used
as decision aids. Gorry (1970) has suggested this possibility.
A model that is to be used by a decision maker need not be
generated by him, but, Gorry points out, it may be derived from his
description of the situation, and it must be thoroughly
understandable by him. In this case the training task becomes that of
teaching an individual to make effective use of the structure that
someone else has imposed upon his problem.

At least one study has been addressed to the question of the
subtasks in terms of which one class of decision makers sees
decision making and how this view would change as a result of training.
Hill and Martin (1971) gave secondary-school teachers
problem-solving exercises designed to train them with respect to some of
nineteen specific skills that they associated with decision making
and to acquaint them with a particular model of the decision-making
process (see Section III). Both before and after training, the
subjects were asked to list the specific steps that they would
take in an effort to solve a hypothetical problem involving an
interperson conflict. Perhaps the most striking aspect of the results
was how large a proportion of the steps that subjects listed fell
in the "formulating-action-alternatives" category. Before training,
more of the listed steps fell in this category than in the other
five combined. The main effect of training was to reduce the
number of steps in this category by about two-thirds and to increase
the usage of some of the other categories slightly; but formulating
alternatives still remained the largest category. The investigators
concluded that training had made the participants more aware of the
several activities involved in decision making, but pointed out
that their results shed no light on the question of whether such
increased awareness would produce better decision making.

SECTION VII
HYPOTHESIS GENERATION

Hypothesis generation is closely associated with problem
structuring. We find it convenient to consider it separately,
however, because it is a more narrowly focused type of activity.
Problem structuring is always important. Even when complete
information is available concerning the state of the world, the
action alternatives and all the possible decision outcomes, it is
still necessary to cast the problem into some mold, and the mold
that is chosen may have much to do with the decision that is made.
Hypothesis generation, on the other hand, is a necessary activity
in those decision situations characterized by uncertainty about
such things as the state of the world and the implications of
selecting specific decision alternatives. Often, in spite
of one's best efforts to gather information, it is not possible
to eliminate uncertainty about these things completely. In such
cases, it is convenient to conceptualize the decision maker's view
of the situation as a set of conjectures, or hypotheses.

7.1 Hypothesis Generation versus Hypothesis Testing

Investigators of cognitive processes have long recognized
two rather different types of thinking. Bartlett (1958) speaks
of closed versus adventurous thinking, Guilford (1963) of
convergent versus divergent thought. Mackworth (1965) distinguishes
problem solvers and problem finders. The one kind of thinking
tends to be deductive and analytical; the other inductive and
analogical. The first has to do with evaluating hypotheses, the
second with generating them. The history of science attests to
the fact that the ability to evaluate hypotheses, to deduce the
implications of theories and put them to empirical test, is a far
more common quality among men than is the ability to generate
hypotheses, to construct theories that organize and structure
facts that were not perceived as related before.

Some formal treatments of decision making require that the
situation, as viewed by the decision maker, be conceptualized as
a set of mutually exclusive and exhaustive hypotheses, each of
which represents one of the possible states of the world. As data
are gathered, they are used to modify a set of probabilities, each
of which represents the decision maker's estimate of the likelihood
that a given hypothesis is true. Much laboratory experimentation
has been devoted to the question of how effectively man can
assimilate data and use it to modify his view of the world as implied
by the probabilities that he associates with the hypotheses that
he is entertaining. (We will consider that problem in the
following section.) However, very little attention has been given
to the question of how capable people are of generating a
reasonable set of hypotheses to begin with, or of modifying the set
when the need to do so arises.

Typically, all of the hypotheses that are to be considered
are provided for the decision maker in advance, so the process of
hypothesis generation is not studied. Moreover, formal decision
procedures usually permit the decision maker only to update the
probabilities that have been assigned to the previously established
set of hypotheses. They fail to recognize the fact that it may
be the case in real-life situations that a set of hypotheses that
is originally developed may not contain the hypothesis that will
eventually prove to be the true one. It often occurs in real-life
situations that incoming data suggest to the decision maker new
hypotheses that have not yet been considered. Any decision-making
procedure that purports to be generally valid must provide for
establishment of new hypotheses whenever the information in hand
indicates the need for them.
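
The limitation just described can be illustrated with a short sketch.
The Python fragment below assumes that the probability revision in
question is done by Bayes' rule; the function names (update,
add_hypothesis), the priors, and the likelihood values are all
hypothetical. The rule used to make room for a new hypothesis, namely
scaling down the existing probabilities proportionally, is one ad hoc
choice among many and is not taken from the literature reviewed here.

```python
def update(beliefs, likelihoods):
    """Bayes' rule over a fixed set of mutually exclusive, exhaustive hypotheses.

    beliefs:     {hypothesis: current probability}
    likelihoods: {hypothesis: P(observed datum | hypothesis)}
    """
    unnormalized = {h: beliefs[h] * likelihoods[h] for h in beliefs}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}


def add_hypothesis(beliefs, name, prior):
    """Admit a new hypothesis by rescaling the old ones (an assumed, ad hoc rule)."""
    rescaled = {h: p * (1.0 - prior) for h, p in beliefs.items()}
    rescaled[name] = prior
    return rescaled


# Two hypotheses supplied in advance; the datum is improbable under both.
prior = {"H1": 0.5, "H2": 0.5}
posterior = update(prior, {"H1": 0.02, "H2": 0.01})
# The posterior still sums to 1 and favors H1, even though neither
# hypothesis accounts well for the datum.

# The decision maker thinks of a third hypothesis that explains the datum.
extended = add_hypothesis(posterior, "H3", 0.2)
revised = update(extended, {"H1": 0.02, "H2": 0.01, "H3": 0.6})
most_probable = max(revised, key=revised.get)
```

Because normalization forces the probabilities of the listed
hypotheses to sum to one, updating alone can never signal that the
true hypothesis is missing from the set; something outside the
updating procedure has to trigger the addition.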

7.2 Importance of Hypothesis Generation

The importance of the function of hypothesis generation can
hardly be overemphasized. To be sure, one may think of some
decision contexts for which all the potentially interesting hypotheses
can be specified in advance. For example, it may be the case for
some straightforward troubleshooting situations that an exhaustive
set of the hypotheses of interest can be listed prior to the
performance of any tests. More typical of complex decision problems,
however, is the case in which the set of possibilities is either
not fully known, or too large to be listed exhaustively. The
problem of the physician who is attempting to diagnose an illness
with a set of symptoms that does not fit a common pattern, or the
investor who is trying to gauge the risks and potential gains in
a speculative financial venture, or the computer programmer who is
tracking down an elusive bug, or the tactician who is trying to
assess the significance of some unorthodox behavior on the part
of a wily opponent is less that of testing prespecified hypotheses
than that of defining hypotheses that it would make sense to
consider.

The difficulty is not so much that of representing a decision
situation in terms of a set of possible states of the world that
is exhaustive and mutually exclusive. The problem is that of
coming up with a set of possibilities that is useful from the
decision maker's point of view. A military commander can always
represent the alternatives that are open to an adversary in terms
of such gross action categories as attack, defend, and withdraw,
and the ability to distinguish among these possibilities would
undoubtedly be of interest. However, a commander's decision-making
responsibilities typically require much more precise information
than would be provided by the resolution of the uncertainty implicit
in these three possibilities. That is to say, he wants to know not
only whether enemy forces plan to attack, but at what time, in what
strength, at what locations, and so forth. It is at this level of
representation that the commander's (or perhaps his intelligence
officer's) hypothesis-generation capabilities are put to the test.

7.3 Experiments on Hypothesis Generation

The study of hypothesis generation in the laboratory has
often involved "concept attainment" or "discover the rule" type
tasks. The work of Bruner, Goodnow, and Austin (1956) illustrates
the use of concept attainment tasks to study this aspect of
thinking. In a typical experiment, a subject attempts to identify a
concept that an experimenter has in mind. The concept usually is
defined in terms of conjunctions or disjunctions of specific
stimulus attributes (e.g., "red and square"; "blue or yellow, and not
circular"). In some situations the subject is shown stimuli, some
of which belong to the conceptual category that he is attempting
to identify and some of which do not. He is told which stimuli
are which and from this "exemplar" information he is to attempt to
identify the concept. Sometimes the subject chooses the stimuli
that he sees, in which case the task can also be used to study a
form of information-gathering behavior.

Obviously, the performance of this task involves hypothesis
testing (a topic to which we will turn in the following section),
but the key problem is that of hypothesis generation. Unless one
comes up with the right hypothesis to test, the testing that he
does will only eliminate some of the untenable possibilities, of
which there may be many.

A basic conclusion that Bruner et al. draw from their
experimental results is that the strategies that subjects employ in these
sorts of tasks can be isolated and described. They identify four
such strategies, for example, that subjects use when they have the
job of discovering a conjunctive concept by selecting stimuli and
being told, concerning each stimulus selected, whether or not it
is an exemplar of the concept that they are attempting to identify.
These strategies differ in terms of the balance they strike among
three parameters: the amount of information obtained from an
observation, the cognitive strain imposed on the subject (amount of
information that must be carried in memory, extent to which involved
inferences must be made), and the risk that the strategy will fail.
The strategies are defined in terms of the nature of the hypotheses
that are generated and put to the test. In one case, for example
("successive scanning"), one specific concept is hypothesized at
a time, and stimuli are chosen in such a way as to test that
hypothesis directly. In another case ("conservative focusing"),
the initial hypothesis, in effect, includes several possible
concepts and an attempt is made to discover the defining attributes
systematically one at a time. Which of the several strategies is
most appropriate depends on the details of the experimental
situation.
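
As a rough illustration of the conservative focusing strategy just
described, the Python sketch below starts from a stimulus known to be
an exemplar and changes one attribute at a time, keeping an attribute
as part of the concept only if changing it turns the stimulus into a
nonexemplar. The attribute names and values, the hidden concept, and
the function names are hypothetical, and the sketch assumes a purely
conjunctive concept with at most one required value per attribute.

```python
# Hypothetical stimulus set: four attributes with three values each.
ATTRIBUTES = {
    "color": ["red", "green", "black"],
    "shape": ["square", "circle", "cross"],
    "count": [1, 2, 3],
    "borders": [1, 2, 3],
}


def is_exemplar(stimulus, concept):
    """Experimenter's feedback: does the stimulus satisfy the conjunctive concept?"""
    return all(stimulus[attr] == value for attr, value in concept.items())


def conservative_focusing(focus, ask):
    """Infer a conjunctive concept from a positive focus stimulus.

    focus: a stimulus already known to be an exemplar
    ask:   function returning True if a proposed stimulus is an exemplar
    """
    concept = {}
    for attr, values in ATTRIBUTES.items():
        probe = dict(focus)
        # Change exactly one attribute and leave the others as in the focus.
        probe[attr] = next(v for v in values if v != focus[attr])
        if not ask(probe):
            # The change destroyed membership, so this attribute is defining.
            concept[attr] = focus[attr]
    return concept


hidden_concept = {"color": "red", "shape": "square"}   # known only to the experimenter
focus_stimulus = {"color": "red", "shape": "square", "count": 2, "borders": 1}
probes_made = []


def ask(stimulus):
    probes_made.append(stimulus)
    return is_exemplar(stimulus, hidden_concept)


learned = conservative_focusing(focus_stimulus, ask)
```

Each probe here settles the status of one attribute, so the memory
load is small and the number of probes equals the number of
attributes; this is the kind of trade-off among information, strain,
and risk on which the strategies are said to differ.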

Bruner et al. found that the strategies that subjects use
tend to change appropriately in response to changes in the
experimental situation; and, on balance, these investigators
considered the performance of their subjects to be quite good. In
their  words:  "In  general,  we  are  struck  by  the  notable  flexibility 
and  intelligence  of  our  subjects  in  adapting  their  strategies  to 
the  information,  capacity,  and  risk  requirements  we  have  imposed 
on  them.  They  have  altered  their  strategies  to  take  into  account 
the  increased  difficulty  of  the  problems  being  tackled,  choosing 
methods  of  information  gathering  that  were  abstractly  less  than 
ideal  but  that  lightened  pressures  imposed  on  them  by  the  tasks 
set  them.  They  have  changed  from  safe-but-slow  to  risky-but-fast 
strategies  in  the  light  of  the  number  of  moves  allowed  them.  They 
have  shown  themselves  able  to  adapt  to  cues  that  were  less  than 
perfect  in  validity  and  have  shown  good  judgment  in  dealing  with 
various  kinds  of  payoff  matrices.  They  have  shown  an  ability  to 
combine partially valid cues and to resolve conflicting cues"
(p. 238).

Performance was not ideal, however. Among the limitations
that were noted were a tendency to persist in focusing on cues
that had proved to be useful in the past even if they were not
useful in the present, and an inability to make as effective use
of information gained from noninstances of a category as of that
gained from category exemplars.

Bruner  et  al.  also  found  that  concepts  defined  in  terms  of 
disjunctions  of  stimulus  attributes  were  more  difficult  to  discover 
than  those  that  were  conjunctively  defined.  This  finding  has  been 
corroborated  by  Neisser  and  Weene  (1962)  who  used  a large  variety  of 
attribute-combination  rules.  Not  surprisingly,  concepts  defined 
in  terms  of  the  presence  or  absence  of  a single  attribute  are 
easier  to  attain  than  are  those  defined  in  terms  of  conjunctions 
or  disjunctions  of  two  or  more  attributes,  which  in  turn  are 
easier  than  those  defined  in  terms  of  more  complex  rules  involv- 
ing combinations  of  conjunctions  and/or  disjunctions  (Haygood 
& Bourne,  1965;  Neisser  & Weene,  1962). 

Another  experimental  task  that  has  been  used  to  study  hypo- 
thesis generation  is  that  of  discovering  the  rule  by  which  a 
specific  sequence  of  numbers  or  letters  was  generated.  Typically, 
the  subject  is  shown  one  or  more  sequences  (or  segments  of  se- 
quences) that  satisfy  the  rule.  He  then  can  propose  other  se- 
quences, or  continuations  of  the  segment,  in  order  to  test  the 
validity  of  tentative  hypotheses  that  he  may  wish  to  consider. 

Each  time  he  proposes  a possibility  he  is  told  whether  it  satis- 
fies the  rule;  and  when  he  feels  he  has  obtained  enough  information 
to  justify  doing  so,  he  is  to  state  the  rule. 

Again, performance of this task obviously involves information
gathering and hypothesis testing as well as hypothesis generation,
but hypothesis generation is in some sense central. What information
is sought is likely to depend strongly on what rule is being
considered. Moreover, unless the correct rule is hypothesized at
some point, it cannot be tested and validated.

The results of experiments along these lines have revealed
some interesting deficiencies in hypothesis-generation behavior
which appear to stem from a lack of understanding of some basic
rules of logic. Wason (1974) has described some results that
suggest that people may have particular difficulty in discovering
rules that are sufficiently general that they subsume many rules
that are more specific. For example, the rule "any three numbers
in increasing order of magnitude" proved to be particularly
difficult for his subjects to discover. If, as examples of triads
that conform to this rule, a subject were given (8 10 12), (14
16 18) and (20 22 24), he might quickly generate the hypothesis
"successive even numbers," test it with other sequences that
satisfy it, and then announce this rule with confidence. What is
disappointing about this behavior is the failure to hypothesize
alternative rules to which the given sequences also conform, and
then to consider sequences that would discriminate between the
alternatives hypothesized. More disturbing, however, is the
finding that even when told of the incorrectness of a hypothesis,
and presented with conclusive infirming evidence, subjects
sometimes insisted that their hypothesized rule was validated by the
fact that all the test sequences that they generated conformed
to it.
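
The logic of this failure can be checked by brute force. The Python
sketch below encodes the two rules from the example above and
enumerates all triads drawn from the integers 1 through 12; the range
and the function names are arbitrary choices made for illustration.
It confirms that every triad satisfying "successive even numbers" also
satisfies the general rule, so that tests chosen to conform to the
specific hypothesis cannot distinguish the two, and it lists the
triads that would.

```python
from itertools import product


def successive_even(triad):
    """The subject's hypothesis: three consecutive even numbers."""
    a, b, c = triad
    return a % 2 == 0 and b == a + 2 and c == b + 2


def increasing(triad):
    """The experimenter's rule: any three numbers in increasing order."""
    a, b, c = triad
    return a < b < c


# All ordered triads over a small, arbitrarily chosen range of integers.
triads = list(product(range(1, 13), repeat=3))

# Every exemplar of the specific rule is also an exemplar of the general rule.
specific_is_subset = all(increasing(t) for t in triads if successive_even(t))

# Tests that conform to the subject's own hypothesis get identical feedback
# under both rules, so they cannot tell the two rules apart.
confirming_tests = [t for t in triads if successive_even(t)]
confirming_tests_uninformative = all(
    increasing(t) == successive_even(t) for t in confirming_tests
)

# Discriminating tests are triads on which the two rules disagree.
discriminating_tests = [t for t in triads if increasing(t) != successive_even(t)]
```

A single triad from discriminating_tests, such as (1, 2, 3), is
enough to refute "successive even numbers" once the subject is told
that it conforms to the rule; no number of confirming tests can do
so.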

Two other results noted by Wason are relevant to the problem
of hypothesis generation, because they also demonstrate how the
process can get bogged down. First is the possibility of
perseveration with an invalidated hypothesis without recognizing that
one is perseverating. He notes, in this regard, that what subjects
often do when informed that a hypothesized rule is not the correct
one is to generate additional triads that are consistent with that
rule and then announce the same rule expressed in different terms.
Second is a tendency, when hypothesized rules are invalidated, to
generate more and more complex rules rather than simpler ones.
The following example is given of a third generation rule produced
by one subject: "The rule is that the second number is random,
and either the first number equals the second minus two, and the
third is random but greater than the second; or the third number
equals the second plus two, and the first is random but less than
the second" (p. 382). Recall that the correct rule was "any three
numbers in increasing order of magnitude." One conclusion that
may be drawn from this type of experimental finding is that the
discovery of a general rule, even though conceptually simple, may
be impeded by the discovery of more specific rules whose exemplars
are also exemplars of the more general rule.

7.4 Hypothesis Generation and Training

Hypothesis generation represents the same sort of challenge
to training and training research as does problem structuring.
The basic need in both cases is for a greater understanding of
how to promote creative thinking.

A specific problem that deserves attention from training
specialists is that of perseveration. Results such as those
obtained by Bruner, Goodnow, and Austin (1956) and by Wason (1974)
indicate the need for training procedures designed to improve
the ability, or increase the willingness, of decision makers to
generate alternatives to the hypothesis, or hypotheses, under
consideration. They demonstrate the importance of sensitizing
decision makers to the danger of accepting a hypothesis on the
basis of insufficient evidence, and to the fact that the best way
to avoid this mistake is to attempt to generate plausible
alternatives and to seek the kind of data that will be most likely to
discriminate among them.
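
The notion of seeking data that discriminate among hypotheses can be
illustrated with a small sketch based on Wason's number-triad task.
The fragment below is an illustration only: the function names and
the probe triad (1, 2, 3) are assumptions introduced here and are not
part of Wason's procedure. It shows that triads chosen to conform to
the favored hypothesis cannot separate it from the correct rule,
whereas a triad on which the two rules disagree can.

```python
def successive_even(triad):
    """The subject's hypothesis: three successive even numbers."""
    a, b, c = triad
    return a % 2 == 0 and b == a + 2 and c == b + 2


def increasing(triad):
    """The experimenter's rule: any three numbers in increasing order."""
    a, b, c = triad
    return a < b < c


def discriminates(triad, h1, h2):
    """A test triad is informative only if the two hypotheses disagree on it."""
    return h1(triad) != h2(triad)


# Triads that merely conform to the favored hypothesis.
confirming = [(8, 10, 12), (14, 16, 18), (20, 22, 24)]
# A hypothetical probe chosen because the two rules classify it differently.
probe = (1, 2, 3)

confirming_results = [
    discriminates(t, successive_even, increasing) for t in confirming
]
probe_result = discriminates(probe, successive_even, increasing)

print(confirming_results)  # [False, False, False]
print(probe_result)        # True
```

Every confirming triad satisfies both rules, so a "yes" from the
experimenter is uninformative; only the probe, which one rule accepts
and the other rejects, can eliminate a candidate.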

SECTION VIII
HYPOTHESIS EVALUATION


Narrowly defined, hypothesis evaluation refers to the process
of applying data to the assessment of the likelihoods of one’s
hypotheses concerning the unknowns of the situation. More generally,
the term might be used to connote the process of extracting
information from data, of attempting to reduce one’s degree of uncertainty
about the parameters of the decision space. In some formally
structured approaches to decision making, hypothesis evaluation may involve
the revision of numerical probability estimates or other
quantitative indicants of relative likelihoods. In other cases the process
may be less explicit, but it is not for that reason less important.
We assume that even in situations that have been given little formal
structure, the decision maker attempts to make use of at least some
of the data that are available to him, in order to clarify his view,
or perhaps to confirm his assessment, of the situation.

The  following  discussion  takes  a rather  broad  view  of 
hypothesis  evaluation.  It  touches  on  a number  of  topics  that 
relate  to  man’s  abilities,  limitations,  biases  and  predilections 
as  a processor  of  information  or  a user  of  evidence.  In  some  cases 
it  may  appear  to  range  beyond  the  specific  subject  of  hypothesis 
evaluation,  and  deal  with  "thinking"  more  generally.  Our  reason 
for  including  this  material  is  that  it  seems  to  us  relevant  to  the 
problem  of  decision  making,  and  it  appears  to  fit  more  readily  here 
than elsewhere within our conceptual framework. In Section 8.6,
the discussion becomes narrowly focused on the problem of revising
probabilities in situations that have been formalized to the extent
that a Bayesian data-aggregation algorithm might be applied.

8.1 Serial versus Parallel Processing

One  question  of  interest  concerning  the  way  people  evaluate 
hypotheses  is  whether  they  consider  them  one,  or  several,  at  a 
time. Empirical data are lacking on the question of which of
these  alternatives  best  characterizes  man's  approach  to  hypothesis 
evaluation.  It  is  our  impression  that  the  prevailing  consensus  is 
that  the  assumption  of  seriality  is  the  more  plausible  of  the  two, 
insofar  as  the  conscious  consideration  of  hypotheses  is  concerned. 

If  the  serial  model  is  the  more  nearly  correct,  this  must 
represent  a basic  limitation  of  man.  It  is  difficult  to  think  of 
a convincing reason why one should evaluate the hypotheses serially
if he is able to treat them in parallel.

But  even  if  we  assume  that  one  cannot  test  several  hypotheses 
at  once,  there  is  still  a question  about  the  order  in  which  testing 
is  done.  One  might  apply  an  incoming  datum  to  each  of  the 
hypotheses in turn. Alternatively, one might focus exclusively on
one hypothesis until one had enough confirming data to accept it,
or until the evidence against it was sufficient to warrant its
rejection, in which case attention would be shifted to another
possibility. Note that in this latter case a datum cannot be
discarded after being applied to the evaluation of one hypothesis
because it may be germane to the evaluation of others later.

One putative advantage of the Bayesian approach (see
Section 8.6) is that it forces the decision maker to apply an
incoming datum to each of the candidate hypotheses in turn. One
of the implications of this fact is that it minimizes the need
for  the  decision  maker  or  system  to  retain  data.  Assuming  that 
the  set  of  hypotheses  with  which  the  decision  maker  is  working 
is  complete,  and  will  not  be  extended,  a datum  can  be  discarded 
once  it  has  been  assimilated  and  the  probabilities  associated 
with  all  the  hypotheses  revised. 
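
The claim that a datum can be discarded once all the probabilities
have been revised can be checked with a short sketch. The example
below is hypothetical: the two-hypothesis urn problem, the names
`bayes_update` and `LIKELIHOOD`, and the assumption that the data are
conditionally independent given each hypothesis are all introduced
here for illustration. Under those assumptions, revising after each
datum and then forgetting it gives the same result as retaining all
the data and processing them together.

```python
from fractions import Fraction


def bayes_update(prior, likelihood):
    """Apply one datum to every hypothesis and renormalize.

    prior:      {hypothesis: P(H)}
    likelihood: {hypothesis: P(datum | H)}
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}


# Hypothetical urn problem: H1 says 70% of the chips are red, H2 says 30%.
LIKELIHOOD = {
    "red": {"H1": Fraction(7, 10), "H2": Fraction(3, 10)},
    "blue": {"H1": Fraction(3, 10), "H2": Fraction(7, 10)},
}
prior = {"H1": Fraction(1, 2), "H2": Fraction(1, 2)}
data = ["red", "red", "blue"]

# Sequential use: each datum is applied to all hypotheses, then forgotten.
posterior = dict(prior)
for datum in data:
    posterior = bayes_update(posterior, LIKELIHOOD[datum])

# Batch use: every datum is retained and processed together at the end.
joint = dict(prior)
for datum in data:
    for h in joint:
        joint[h] *= LIKELIHOOD[datum][h]
batch = {h: joint[h] / sum(joint.values()) for h in joint}

print(posterior == batch)  # True
print(posterior["H1"])     # 7/10
```

The equivalence depends on the hypothesis set being fixed in advance;
if a new hypothesis were added later, the discarded data would be
needed again to evaluate it.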

8.2  Subconscious  Processes 

What is happening at a subconscious level is, of course,
even less well understood. The belief has been expressed that
the brain carries on problem-solving activity even when one is not
consciously thinking about a problem. Wallas (1926) elaborated
and popularized the notion, which he credits to Helmholtz, that
creative thinking often involves a period of "incubation," which
follows a period of "preparation," and precedes a period of
"illumination." During the preparation period, according to this
view, the problem solver consciously labors on the problem; during
the illumination period the problem solver becomes aware of the
solution for which he was seeking. No conscious attention is
given to the problem during the incubation period, but, Wallas
suggests, much subconscious exploration of the problem takes place.

While  the  idea  has  primarily  anecdotal  support,  the 
testimony  of  creative  thinkers  about  the  way  they  have  arrived 
at  solutions  to  difficult  problems  is  fairly  compelling  evidence 
that  something  of  this  sort  does  occur.  We  mention  it  in  this 
context  to  make  the  point  that  the  fact  (if  it  is  a fact)  that 
decision  makers  tend  to  apply  newly  acquired  data  to  the  evaluation 
of  only  one  hypothesis  at  a time,  should  probably  not  be  taken  as 
conclusive  evidence  that  the  credibility  of  a hypothesis  not  under 
consideration  has  not  been  affected  by  those  data.  Moreover,  it 
is  at  least  a plausible  conjecture  that  the  likelihood  that  any 
given hypothesis will "suggest itself" for explicit consideration
may  depend  to  some  degree  on  such  subconscious  activity  (Maier, 
1931). 

Dreyfus (1961) has argued that such subconscious, or
marginally conscious, activity is a general and difficult-to-simulate
characteristic of man as a problem solver. It is this
ability that makes it possible for him to consider consciously
only the "interesting" moves in a game of chess without explicitly
considering all possible moves and rejecting those that are not
worth pursuing. But subconscious processes are beyond the scope
of this report, so we will not pursue the topic further.

8.3 Man as an Intuitive Logician

Technically,  logic  is  the  discipline  which  deals  with  the 
rules  of  valid  inference.  The  term  is  used  colloquially,  however, 
as a synonym for reasoning. It is of some relevance to the
general problem of decision making, and in particular to the
problem of training decision makers, to consider whether reasoning
as  it  is  practiced  by  people  is  logical  in  the  technical  sense; 
and,  to  the  extent  that  it  is  illogical,  whether  it  is  illogical 
in  consistent  ways.  A further  question  of  interest  is  whether 
training  in  formal  logic  can  reasonably  be  expected  to  improve 
decision-making  performance. 

Philosophers  have  not  been  in  agreement  on  the  first 
question.  Henle  (1962)  points  out  that  some  of  the  19th  century 
writers  (e.g.,  Boole,  1854;  Kant,  1885;  Mill,  1874)  viewed  logic 
as the science of the laws of thought. Some more recent writers
(e.g., Cohen, 1944; Russell, 1904; Schiller, 1930) have treated
logic as something quite independent of thought processes and have
rejected the notion that thinking necessarily conforms to logical
principles.* A middle-of-the-road view is that thinking sometimes
conforms to logical principles--especially when one's explicit
purpose is to reason carefully and deductively--and sometimes
does not.


*A  cynic  might  assert  that  few  arguments  are  won  or  lost  on 
logical grounds. Certainly, the alogical stratagems that can
be applied to arguments are numerous, and perhaps are better
learned in the course of normal development than are the rules
of inference. The disputatious reader who feels his arsenal
of such stratagems is deficient is referred to Schopenhauer
(no date) who provides a veritable cornucopia of them.

Whether or not thinking is logical may be difficult to
determine empirically in any particular case, because the steps
by which one arrives at a conclusion usually are not available
for observation. As Mill (1874) points out, since "the premises
are seldom formally set out,... it is almost always to a certain
degree optional in what manner the suppressed link shall be filled
up... [A person] has it almost always in his power to make his
syllogism good by introducing a false premise; and hence it is
scarcely ever possible decidedly to affirm that any argument involves
a bad syllogism" (p. 560; from Henle, 1962).

Individuals undoubtedly differ greatly in their ability to
think logically, and any characterization of human strengths and
weaknesses in this regard is bound to be only partially correct.
There are many ways in which reasoning can be illogical, however,
and it is not unreasonable to ask whether some of the many possible
evidences of fallibility are appreciably more common than others.
Several ways in which human reasoning does seem to depart from the
rules of logic have been discussed by Henle (1962). These include:
failure to distinguish between the factual truth of a conclusion
and the logical validity of the argument on which it is based;
restatement of a premise or a conclusion, which may have the
effect of preserving a logically valid form, while changing the
substance of the argument; the omission of premises from an argument,
or the addition of spurious premises. The fallacy of the
"undistributed middle" is one that has long been recognized as
particularly bothersome, and involves the assignment of different
meanings to the same term when it appears in different premises.

Another type of logical error that seems to be commonly
made involves a misunderstanding of the syllogistic form: "If A
then B; A; therefore B," or "If A then B; not B; therefore not A."
These forms may be perverted either as "If A then B; not A;
therefore not B," or "If A then B; B; therefore A." Both of these
forms are invalid; nevertheless most readers will probably recognize
them as forms that they have encountered, and perhaps used, in
arguments.
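
The validity of the two proper forms and the invalidity of the two
perversions can be verified mechanically by examining every
assignment of truth values to A and B. The sketch below is offered
only as an illustration; the function names and the descriptive
labels attached to each form are assumptions introduced here. An
argument form is treated as valid if no assignment makes all of its
premises true and its conclusion false.

```python
from itertools import product


def implies(p, q):
    """Material conditional: 'if p then q'."""
    return (not p) or q


def is_valid(premises, conclusion):
    """True if no truth assignment satisfies every premise yet falsifies the conclusion."""
    for a, b in product([True, False], repeat=2):
        if all(premise(a, b) for premise in premises) and not conclusion(a, b):
            return False
    return True


def if_a_then_b(a, b):
    return implies(a, b)


# Each entry: (premises, conclusion), written as functions of A and B.
forms = {
    "A; therefore B": ([if_a_then_b, lambda a, b: a], lambda a, b: b),
    "not B; therefore not A": ([if_a_then_b, lambda a, b: not b], lambda a, b: not a),
    "not A; therefore not B": ([if_a_then_b, lambda a, b: not a], lambda a, b: not b),
    "B; therefore A": ([if_a_then_b, lambda a, b: b], lambda a, b: a),
}

results = {name: is_valid(premises, conclusion)
           for name, (premises, conclusion) in forms.items()}
for name, valid in results.items():
    print(f"If A then B; {name}: {'valid' if valid else 'invalid'}")
```

Both perverted forms fail on the same case, A false and B true, in
which the conditional premise holds but the conclusion does not.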

Wason  (1974)  describes  a failure  in  reasoning  that  he  has 
observed  that  seems  to  be  related  to  this  type  of  misunderstanding. 
Four  cards  are  placed  on  a table  so  the  subject  can  see  only  one 
side  of  each  of  them.  The  cards  contain  respectively  a vowel,  a 
consonant,  an  even  number  and  an  odd  number.  The  subject  is  told 
that  each  card  has  a letter  on  one  side  and  a number  on  the  other, 
and  is  asked  which  cards  would  have  to  be  turned  over  to  determine 
the truth or falsity of the statement: "If a card has a vowel on
one side, then it has an even number on the other." The majority of
Wason's subjects chose either the card showing the vowel and the
one showing the even number, or just the card showing the vowel.
The correct answer is: the card that shows the vowel and the one
that shows the odd number. Only by finding an odd number behind
the vowel or a vowel behind the odd number would the statement be
falsified. The students' choice of the card with the even number
is a form of the fallacy known as asserting the consequent: "If
A then B; B; therefore A."
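
The logic of the four-card problem can be made explicit by asking,
for each visible face, whether any possible hidden face could falsify
the statement. The following sketch is illustrative only: the
particular faces (E, K, 4, 7), the pools of possible hidden faces,
and the function names are assumptions introduced here and are not
taken from Wason's materials.

```python
VOWELS = set("AEIOU")


def rule_holds(letter, number):
    """'If a card has a vowel on one side, it has an even number on the other.'"""
    return (letter not in VOWELS) or (int(number) % 2 == 0)


def must_turn(visible, letters="AK", numbers="47"):
    """A card needs turning only if some possible hidden face would falsify the rule.

    `letters` and `numbers` are hypothetical pools of hidden faces, each
    containing one vowel/consonant and one even/odd number respectively.
    """
    if visible.isalpha():
        return any(not rule_holds(visible, n) for n in numbers)
    return any(not rule_holds(letter, visible) for letter in letters)


# Visible faces: a vowel, a consonant, an even number, an odd number.
cards = ["E", "K", "4", "7"]
to_turn = [card for card in cards if must_turn(card)]
print(to_turn)  # ['E', '7']
```

The card showing the even number is never selected, because nothing
on its other side could contradict the statement.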

This type of reasoning error occurs with sufficient
consistency (at least among college students) to have prompted
investigation by several researchers. A completely satisfactory
explanation has not yet been forthcoming. Wason seems to favor the view
that the choice of cards is made on an intuitive basis and that
the "reasons" for the choice--which subjects give in response to
the experimenter's inquiries--are really rationalizations. "This
hypothesis is consistent with our crude knowledge about intuition.
A verdict may occur to a judge before the grounds which support
it have been spelled out; a chess player may 'see' a good move, and
then analyze the continuations which validate it. Such thought
suggests a processing mechanism which operates at different levels"
(p. 385).

The  last  chapter  on  the  topic  of  the  relationship  between 
logic  and  thought  has  not  been  written.  And  it  cannot  be  until 
much  more  is  known  about  the  workings  of  the  human  mind.  The 
immediate  challenge  for  training  research  is  to  identify  ways  to 
improve  the  capability  of  individuals  to  reason  logically,  or  at 
least  to  recognize  and  be  able  to  avoid  the  more  common  illogical 
pitfalls.

8.4 Man as an Intuitive Statistician

It is quite clear that most individuals could manage to get
through life without ever explicitly assigning a numerical
probability to an event. Undoubtedly, the vast majority of people
do so. It seems safe to assume, however, that people do make
judgments of likelihoods, and that these judgments--even though
nonnumeric, and often implicit--condition their behavior. An
individual carries an umbrella because he thinks there is a good
chance of rain, or buys stock that he expects to appreciate. One
purchases life insurance before boarding an airplane because one,
in effect, has considered the likelihood that the plane will go
down during that flight to be nonnegligible; the fact that he
boards the plane at all is probably evidence that he also considers
that likelihood to be something less than certainty. One chooses
one among three job opportunities, because the chances of success
and advancement are perceived as greater in the case of the selected
job than in that of the others. In short, although most of us do
not attempt to assign numeric probabilities to possible situations
or events, we behave as though our choices had been dictated by
reasoning of the sort: this event is more likely than that, or the
likelihood of this situation is great enough so that I had better
do thus and so in order to be prepared if it should occur.

A question of some practical interest, therefore, is that of
how effectively such judgments are made. For many situations,
there is no way to answer this question objectively. The
individual who selects one job from among three possibilities because he
considers the likelihood of success to be highest for that case
will never know for certain whether his judgment was correct.
There are also, however, many situations for which the "objective"
probabilities of events are known or can be determined, and we
can at least ask how well people do when asked to estimate
probabilities explicitly in these cases. The literature that is
relevant to this question falls fairly naturally into three
categories. First are the studies that deal with people’s ability to
estimate the statistical properties of samples that they are
permitted to observe. Such studies concern relative frequencies
rather than probabilities, but to the degree that our ideas about
probabilities are based on, or influenced by, perceived frequencies
they are germane. Second are some studies that have to do with
the extent to which people's intuitive notions about the
probabilities of events correspond to, or conflict with, the implications
of the theory of probability as represented in the probability
calculus. Third are numerous recent experiments that consider the
specific question of how effectively people function as Bayesian
data aggregators. In this section we will consider briefly the
first of these three categories of studies; in Sections 8.5 and 8.6
we will consider the last two.

People appear to be reasonably good at perceiving proportions,
or the relative frequencies of occurrence, of both sequential and
simultaneous events (Attneave, 1953; Peterson & Beach, 1967;
Schrenk & Kanarick, 1967; Erlich, 1964; Vlek, 1970) and at
estimating the means of number sequences (Beach & Swensson, 1966;
Edwards, 1967). Inferences concerning the median or mode of a
skewed distribution (assuming the subject knows the definitions
of these terms) are fairly accurate, and the estimated mean of such
distributions tends to be biased in the direction of the median
(Peterson & Beach, 1967). One's confidence in one's estimate of
the mean or the variance of a population appears to increase as the
sample size increases (Peterson & Beach, 1967; but see also Pitz,
1967).

Estimates of the variability of a set of data often tend to
decrease as the mean increases (Hofstatter, 1939; Lathrop, 1967;
Peterson & Beach, 1967). Peterson and Beach (1967) point out that
while the notion that variability is necessarily inversely related
to the mean is erroneous, it is intuitively compelling. "Think
of the top of a forest. The tree tops seem to form a fairly smooth
surface, considering that the tree may be 60 or 70 feet tall. Now,
look at your desk top. In all probability it is littered with many
objects and if a cloth were thrown over it the surface would seem
very bumpy and variable. The forest top is far more variable than
the surface of your desk, but not relative to the sizes of the
objects being considered" (p. 31). One is led to wonder whether
the finding that estimated variability tends to decrease with
increasing mean might be due in part to failure by the subject
to understand that it is an estimate of absolute variability that
he is to produce. Relative variability probably often does
decrease as the mean increases (to cite Peterson and Beach's tree
top example), and without explicit instructions to the contrary it
would not be unreasonable for a subject to suit the terms to the
context, as one does when one speaks both of a small skyscraper
and a large dog.
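
The distinction between absolute and relative variability can be
stated numerically. In the sketch below, absolute variability is
taken to be the standard deviation and relative variability the
standard deviation divided by the mean (the coefficient of
variation); both choices, and the height figures themselves, are
assumptions made here purely for illustration and are not data from
Peterson and Beach.

```python
from statistics import mean, pstdev


def absolute_variability(values):
    """Population standard deviation, in the units of the data."""
    return pstdev(values)


def relative_variability(values):
    """Coefficient of variation: standard deviation as a fraction of the mean."""
    return pstdev(values) / mean(values)


# Hypothetical heights, in feet.
tree_heights = [60, 62, 65, 68, 70]
desk_object_heights = [0.01, 0.05, 0.10, 0.25, 0.50]

for label, heights in [("forest", tree_heights), ("desk", desk_object_heights)]:
    print(label,
          round(absolute_variability(heights), 3),
          round(relative_variability(heights), 3))
```

With these figures the forest is the more variable in absolute terms
and the desk top the more variable in relative terms, which is the
ambiguity a subject might resolve differently from the experimenter.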

8.5 Intuitive Probability Theory

How closely do man’s intuitions about probabilities
correspond to the implications of probability theory? The question
cannot be answered decisively, but a number of pertinent
observations can be made. For example, people often seem to find it
difficult to believe that the outcome of an event can be
independent of what has preceded it. This difficulty is sometimes
manifested in the "gambler's fallacy" (a fallacy that competent
gamblers probably would not make), one form of which holds that a
run of successes increases the likelihood of a failure, or vice
versa (Cohen & Hansel, 1956). Another example of assumed dependence
among successive events has been noted by Jarvik (1951), who found
that when given a two-alternative prediction task, subjects often
tended to predict the more frequent event after one occurrence of
the less frequent event and to predict the less frequent after two
consecutive occurrences of the more frequent event.


Several experimenters have found that man does not estimate
the probability of compound events very accurately. In particular,
when assessing the likelihood of the joint occurrence of several
independent events, he tends to produce estimates that are too
high (Cohen, Chesnick, & Haran, 1972; Fleming, 1970; Slovic, 1969).
Conversely, when estimating the probability of disjunctive events--
the probability that any one of several specified events will occur--
he tends to produce estimates that are too low (Cohen, Chesnick,
& Haran, 1972; Tversky & Kahneman, 1974). The overestimation of
the probability of conjunctive events is consistent with the
observation that people frequently base judgments of the degree of
correlation between two events on those cases in which the outcomes
of interest do occur together without giving sufficient
consideration to those cases in which they do not (Peterson & Beach, 1967).
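
The normative values against which such estimates are compared follow
directly from the product rule for independent events. The sketch
below is illustrative; the function names and the particular
probabilities (seven events at .9, seven events at .1) are
assumptions chosen here to show how quickly the correct values depart
from the probabilities of the component events.

```python
from math import prod


def p_conjunction(probabilities):
    """P(all of several independent events occur)."""
    return prod(probabilities)


def p_disjunction(probabilities):
    """P(at least one of several independent events occurs)."""
    return 1 - prod(1 - p for p in probabilities)


# Seven independent events, each quite likely: the conjunction is not.
likely_events = [0.9] * 7
# Seven independent events, each quite unlikely: the disjunction is not.
unlikely_events = [0.1] * 7

print(round(p_conjunction(likely_events), 3))    # 0.478
print(round(p_disjunction(unlikely_events), 3))  # 0.522
```

An estimate anchored on the probability of a single component event
would therefore be too high in the conjunctive case and too low in
the disjunctive case, which is the direction of error reported above.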


What is of more interest than the fact that man's intuitions
sometimes lead to incorrect judgments about event probabilities is
the question of the extent to which the failings of intuition--
at least insofar as they are systematic--are explainable in terms
of identifiable ways in which such judgments are made. In a
recent series of studies, Tversky and Kahneman (1971, 1973, 1974;
Kahneman & Tversky, 1972, 1973) have explored this question. The
general approach in these studies and in those of others who have
conducted similar investigations (e.g., Alberoni, 1962; Tune, 1964;
Wagenaar, 1970) has been to ask people to estimate the probability
of the occurrence of a hypothetical event, or, perhaps more commonly,
to indicate which of two such events is the more probable. One
might be asked, for example, to indicate which of the two following
sequences of coin tosses is the more likely, HHHHTTTT or HHTHTTHT;
or to indicate which of two hospitals--which record approximately
15 and 45 births a day, respectively--would have the larger
frequency of days on which more than 60% of the babies born are
boys.
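
The objective answers to questions of this kind can be computed
directly. The sketch below assumes, for illustration only, that each
toss is fair, that each birth is a boy with probability exactly 1/2,
and that successive outcomes are independent; the function names are
likewise introduced here. It uses exact rational arithmetic so that
no rounding enters the comparison.

```python
from fractions import Fraction
from math import comb


def p_exact_sequence(sequence):
    """Probability of one particular sequence of fair, independent tosses."""
    return Fraction(1, 2 ** len(sequence))


def p_more_than_60_percent_boys(n):
    """P(more than 60% of n births are boys), with P(boy) assumed to be 1/2."""
    # 5 * k > 3 * n is the integer form of k / n > 0.6.
    favorable = sum(comb(n, k) for k in range(n + 1) if 5 * k > 3 * n)
    return Fraction(favorable, 2 ** n)


# Any specific sequence of eight tosses is exactly as likely as any other.
print(p_exact_sequence("HHHHTTTT") == p_exact_sequence("HHTHTTHT"))  # True

small_hospital = p_more_than_60_percent_boys(15)
large_hospital = p_more_than_60_percent_boys(45)
print(float(small_hospital), float(large_hospital))
print(small_hospital > large_hospital)  # True
```

Under these assumptions the smaller hospital records such days more
often, since a proportion based on 15 births strays from one half
more readily than one based on 45.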

The  results  of  these  studies  have  revealed  a number  of  ways 
in  which  the  answers  that  people  give  to  such  questions  depart 
systematically from the objective probabilities of the events as
inferred from the application of probability mathematics. Tversky
and Kahneman attribute such failures in judgment to the heuristic
principles  that  people  often  use  when  attempting  to  estimate 
probabilities  or  relative  likelihoods. 

It will be helpful, before considering some of Tversky and
Kahneman's specific results, to digress briefly to consider the
notion of a heuristic principle or procedure. The term "heuristic,"
which comes from the Greek heuriskein, meaning "serving to
discover," appears sporadically in the literature of philosophy and
logic as the name of a branch of study dealing with the methods
of inductive reasoning. It was revived by Polya (1957) in his
classic treatise on problem solving, and used to connote inductive
and analogical reasoning leading to plausible conclusions, as
opposed to the deductive developments of rigorous proofs. In
recent years, computer scientists, and especially researchers in
the area of machine intelligence, have appropriated the term to
connote "a rule of thumb, strategy, trick, simplification, or other
kind of device which drastically limits search for solutions in
large problem spaces" (Feigenbaum & Feldman, 1963, p. 6). In short,
a heuristic principle or procedure, usually referred to simply as a
heuristic, is a means of making an inherently difficult problem more
tractable. The criterion by which a heuristic is measured is its
usefulness. It is important to bear in mind, however, that
heuristics are not expected to lead invariably to correct solutions.
"A 'heuristic program,' to be considered successful, must work well
on a variety of problems, and may often be excused if it fails on
some" (Minsky, 1963, p. 408).

8.5.1 Representativeness

Tversky and Kahneman describe two heuristic principles--
representativeness and availability--which they feel account for
many of the systematic judgmental biases that they and other
investigators have observed. According to the representativeness
principle, "the subjective probability of an event, or a sample,
is determined by the degree to which it: (i) is similar in
essential characteristics to its parent population; and (ii)
reflects the salient features of the process by which it is
generated" (Kahneman & Tversky, 1972, p. 430). Several examples of
the application of this principle are given; two will suffice for
our purposes, one illustrating each of the subprinciples.

The importance of the similarity between the judged event
and the parent population is illustrated by the following question:
"All families of six children in a city were surveyed. In 72
families the exact order of births of boys and girls was GBGBBG.
What is your estimate of the number of families surveyed in which
the exact order of births was BGBBBB?" (Kahneman & Tversky, 1972,
p. 432). If the probabilities of male and female births were
exactly equal, the two birth sequences would be equally probable.
(Apparently, the frequency of male births is slightly higher than
that of female births, so the latter sequence is slightly more
probable than the former.) About 80% of the subjects (high-school
students)  who  were  asked  this  question  judged  the  latter  sequence 
to  be  less  likely  than  the  former;  the  median  estimated  number  of 
families  with  this  birth  order  was  30.  Kahneman  and  Tversky 
attributed  this  result  to  the  fact  that  the  two  birth  sequences, 
while  about  equally  likely,  are  not  equally  representative  of 
families  in  the  population.  The  former  sequence  is  more  similar 
to  a larger  proportion  of  the  population,  both  in  terms  of  the 
relative  number  of  girls  and  boys,  and  in  terms  of  the  length  of 
runs  of  births  of  the  same  sex. 

The second way in which the representativeness heuristic
manifests itself--in sensitivity to the degree to which an event
reflects the salient features of the process that generated it--
is illustrated by the tendency of people to consider regularities
in small samples to be inconsistent with the assumption that such
samples were generated by a random process. Thus, when people are
asked to produce random sequences such as the results of an imagined
series of coin tosses, they tend to produce fewer long runs than
would a truly random process. Moreover, in judging the randomness
of small samples, they are likely to reject as nonrandom many of
the samples that a random process does generate. Kahneman and
Tversky characterize the intuition that produces such judgmental
biases as a belief that a representative sample should represent
the essential characteristics of the parent population, not only
globally, but locally in each of its parts. In other words the
observed behavior is consistent with the belief that the law of
large numbers applies to small numbers as well (Tversky & Kahneman,
1971).
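
How common long runs actually are in short random sequences can be
determined by exhaustive enumeration. The sketch below is an
illustration under the assumption of fair, independent tosses; the
function names and the choice of ten tosses and runs of four are
introduced here and do not come from the studies cited.

```python
from fractions import Fraction
from itertools import groupby, product


def longest_run(sequence):
    """Length of the longest block of identical consecutive outcomes."""
    return max(len(list(block)) for _, block in groupby(sequence))


def p_run_at_least(n, k):
    """P(a sequence of n fair tosses contains a run of k or more), by enumeration."""
    hits = sum(1 for seq in product("HT", repeat=n) if longest_run(seq) >= k)
    return Fraction(hits, 2 ** n)


p = p_run_at_least(10, 4)
print(p, float(p))  # 119/256, a little under one half
```

Nearly half of all ten-toss sequences thus contain a run of four or
more identical outcomes, so a judge who treats such runs as evidence
of nonrandomness will reject a large share of genuinely random
samples.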


The application of this heuristic could lead one to the sort
of fallacious thinking illustrated by the conclusion that the
probability of finding more than 600 boys in a random sample of 1000
children is the same as that of finding more than 60 boys in a random
sample of 100 children. The probability of the latter event is,
of course, much greater than that of the former. Kahneman and
Tversky (1972) showed that people (at least high-school students)
do virtually ignore the effect of sample size when estimating the
probabilities of random events of this sort. In general, the
estimates made by Kahneman and Tversky's subjects, when asked to
judge the probability of events that have a binomial distribution,
were much more appropriate for small samples than for
large samples (e.g., 100 or 1000). In other words, for large samples
subjects tended to underestimate grossly the probability of
high-probability events and overestimate the probability of
low-probability events, and the magnitude of the miss increased with
the size of the sample.
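
The size of the difference between the two events can be checked
directly. The following is a minimal sketch, assuming that each child
is independently a boy with probability .5 (a value chosen here for
illustration; the text does not state one). The helper name
prob_more_than is hypothetical.

```python
from math import comb

def prob_more_than(k, n, p=0.5):
    """P(X > k) for X ~ Binomial(n, p); p = 0.5 is an illustrative assumption."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k + 1, n + 1))

p_small = prob_more_than(60, 100)    # more than 60 boys among 100 children
p_large = prob_more_than(600, 1000)  # more than 600 boys among 1000 children
print(p_small, p_large)
```

Under this assumption the event for the sample of 100 is many orders
of magnitude more probable than the event for the sample of 1000,
although the sample proportion involved is the same in both cases.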

8.5.2  Availability 

The availability principle, according to Tversky and Kahneman
(1973), is used whenever one bases estimates of frequency or
probability on the ease with which instances or associations are
called to mind. For example, when asked to estimate the relative
likelihoods of heart attacks for men and women, one might think of
male and female victims of heart attack among one's personal
acquaintances and take the ratio as an estimate of the relative
likelihoods in the population. Or, if asked to judge which of two
letters occurs the more frequently as the first letter of English
words, one might attempt to think of a few words of each class and
make the judgment on the basis of the rapidity with which examples
come to mind.

Tversky and Kahneman point out that "availability is an
ecologically valid cue for the judgment of frequency because, in
general, more frequent events are easier to recall or imagine than
infrequent ones. However, availability is also affected by various
factors which are unrelated to actual frequency. If the availability
heuristic is applied, then such factors will affect the perceived
frequency of classes and the subjective probability of events.
Consequently, the use of the availability heuristic leads to
systematic biases" (1973, p. 209).

As one example of how application of the availability heuristic
can lead to an erroneous judgment, Tversky and Kahneman report the
following experiment. Subjects were asked to estimate the number of
different r-member committees that can be formed from a group of 10
people. The estimates tended to decrease with increasing r for
values of r between 2 and 8. In particular, subjects typically
judged it to be possible to form many more committees of size 2
than of size 8, when in fact the same number is possible in both
cases. (Similar results were obtained when subjects were asked
to estimate the number of different patterns of r stops that a bus
could make while traversing a route with 10 stations between start
and finish.) The explanation for this result, according to Tversky
and Kahneman, lies in the fact that committees of two members are
more readily imagined than those of eight, and, consequently, appear
to be more numerous.
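
The equality of the two counts follows from the fact that choosing
the 2 members of a committee is the same act as choosing the 8 people
left out of it. A short check of the counts for r between 2 and 8,
offered only as an illustration:

```python
from math import comb

# Number of distinct r-member committees from a group of 10, for r = 2..8.
counts = {r: comb(10, r) for r in range(2, 9)}
print(counts)
```

The counts rise to a maximum at r = 5 and then fall symmetrically, so
that the values for r = 2 and r = 8 coincide.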

The major difference between the heuristic principles of
representativeness and availability, Kahneman and Tversky suggest,
is in the nature of the judgments on which the subjective
probability estimates are based. "According to the representativeness
heuristic, one evaluates subjective probability by the degree of
correspondence between the sample and the population, or between
an occurrence and a model. This heuristic, therefore, emphasizes
the generic features, or the connotation, of the event. According
to the availability heuristic, on the other hand, subjective
probability is evaluated by the difficulty of retrieval and
construction of instances. It focuses, therefore, on the particular
instances, or the denotation, of the event. Thus, the
representativeness heuristic is more likely to be employed when events
are characterized in terms of their general properties; whereas, the
availability heuristic is more likely to be employed when events
are more naturally thought of in terms of specific occurrences"
(Kahneman & Tversky, 1972, p. 452). A feature common to both
heuristics is their reliance on mental effort as an indicant of
subjective probability. "It is certainly harder to imagine an
uncertain process yielding a nonrepresentative outcome than to
imagine the same process yielding a highly representative outcome.
Similarly, the less available the instances of an event, the harder
it is to retrieve and construct them" (ibid, p. 452).

8.5.3  A Methodological  Consideration 

There is a methodological consideration relating to
some of the findings of judgmental biases that deserves more
attention than it has received. This has to do with the possible
role of language ambiguities. We have already alluded more than once
to the well known fact that the meaning of language is conditioned
by the situation in which it occurs. To borrow an example from
Dreyfus (1961), "a phrase like 'stay near me' can mean anything from
'press up against me' to 'stand one mile away,' depending upon
whether it is addressed to a child in a crowd or a fellow astronaut
exploring the moon" (p. 20). Although it seems unlikely that many
of the results that have been mentioned above can be attributed to
the imprecision of language, the possibility that some of them
may be based, at least in part, on this factor should not be
overlooked. The finding that the estimated variability of a set
of data tends to decrease as the mean increases was mentioned in
a preceding section as one possible case in point. Tversky and
Kahneman's finding that people judge it to be possible to form a
larger number of different 2-man committees than 8-man committees
from a pool of 10 men may be another. There is a way of defining
"different" (e.g., "having no people in common") such that the
judgment would be valid, and before one can take the results as
evidence of faulty intuitions concerning combinatorics, one must
be certain that none of the subjects is using such a definition.
Our guess is that language ambiguities will not go far toward
explaining the results obtained by Tversky and Kahneman, but it
seems conceivable that they may have played some role, and some
further research might be directed toward determining the extent
of that role.

8.5.4  Training  and  Intuitive  Probability  Theory 

We  have  reviewed  these  results  at  some  length  because  this 
general  line  of  research  strikes  us  as  being  not  only  exceptionally 
interesting  from  a theoretical  point  of  view,  but  of  considerable 
practical significance. To the extent that the heuristics that
have been identified are representative of the ways in which people
generally  make  judgments  of  likelihood,  it  is  clearly  important  to 
determine  those  conditions  under  which  they  lead  to  erroneous 
judgments  and  those  under  which  they  do  not.  Tversky  and  Kahneman 
have  demonstrated  that  there  are  at  least  some  situations  in  which 
judgments,  that  are  presumably  based  on  identifiable  heuristics, 
err  in  systematic  ways.  This  does  not,  of  course,  establish  that 
these  heuristics  are,  on  balance,  bad,  as  they  are  careful  to 
point  out.  What  one  would  like  to  know  is  the  relative  frequency 
with  which  they  lead  to  erroneous  decisions  in  practical  real-life 
situations.  From  the  point  of  view  of  the  training  of  decision 
makers  the  question  is  how  to  foster  the  use  of  such  heuristics  in 
situations  in  which  they  are  most  likely  to  be  effective,  while 
discouraging  their  use  in  situations  in  which  they  are  likely  to 
lead  to  erroneous  judgments.  Perhaps  at  least  a small  step  in 
that  direction  would  be  to  make  decision  makers  explicitly  aware 
of  the  nature  of  the  heuristics  that  tend  to  be  used  in  estimating 
probabilities,  and  of  the  types  of  erroneous  decisions  to  which 
they can sometimes lead.

8.6 Bayesian Inference

Undoubtedly  the  most  widely  advocated  formal  approach  to  the 
application  of  incoming  data  to  the  evaluation  of  hypotheses  is 
the  "Bayesian"  approach.  Because  it  has  attracted  so  much  atten- 
tion and  has  been  the  focus  of  so  much  research,  we  will  consider 
it  in  some  detail. 

8.6.1  Bayes  Rule 

It is necessary to begin with a set of mutually exclusive
and exhaustive hypotheses, H_i, concerning the state of the world.
To each of these hypotheses one must assign a probability, p(H_i),
that that hypothesis is true. Because these hypotheses are, by
definition, mutually exclusive and exhaustive, it follows that the
a priori probabilities sum to one, i.e.,

$$ \sum_{i} p(H_i) = 1. \tag{1} $$

Inasmuch as the hypotheses that one is considering are likely
to have different implications concerning what might be observed
under specified conditions, it seems intuitively reasonable that
one should be able to increase one's degree of certainty concerning
the truth or falsity of any given hypothesis by making appropriate
observations. For example, if H_i implies D, and if D is
observed, then the credibility of H_i might reasonably be expected
to be increased. (The truth of H_i is not proved by such an
observation, of course, inasmuch as it does not follow from the fact
that H_i implies D that D implies H_i; as was pointed out in
Section 8.3, inferring the truth of H_i from the observation of D
would involve the logical fallacy known as "asserting the
consequent.") If both H_i and H_j could lead to D, but the likelihood
of D given H_i is greater than its likelihood given H_j, then our
intuitive notions about evidence suggest that the observation of
D should increase our confidence in H_i somewhat more than our
confidence in H_j. These notions were expressed formally by the
18th Century British minister, Thomas Bayes, in the so-called
"inverse probability theorem"--a theorem or rule that has been
the subject of much debate.

Bayes rule expresses p(H_i|D), the probability that H_i is true
given the observation, or datum, D, as a function of p(D|H_i), the
probability that D will be observed given that H_i is true, and p(H_i),
the probability that H_i is true as determined prior to the
observation of D. The probability of an observation given a hypothesis,
p(D|H), is usually referred to as a conditional probability; the
probability of a hypothesis given an observation, p(H|D), is usually
called a posterior probability. Bayes rule defines a procedure for
using the fact that D has been observed, to adjust one's estimate of
the probability that H_i is true. The rule may be written as

$$ p(H_i|D) = \frac{p(D|H_i)\, p(H_i)}{\sum_{j=1}^{n} p(D|H_j)\, p(H_j)}, \tag{2} $$

where n is the total number of hypotheses in the set. Because
$\sum_{j=1}^{n} p(D|H_j)\, p(H_j) = p(D)$, equation (2) may be simplified to

$$ p(H_i|D) = \frac{p(D|H_i)\, p(H_i)}{p(D)}. \tag{3} $$
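
As a sketch of how equations (2) and (3) might be carried out
numerically, the following computes the posterior probabilities for
a set of three hypotheses. The function name and the numerical values
are illustrative assumptions, not taken from the report.

```python
def posterior(priors, likelihoods):
    """Equations (2)/(3): priors[i] = p(H_i), likelihoods[i] = p(D|H_i)."""
    p_d = sum(l * p for l, p in zip(likelihoods, priors))  # p(D), the denominator
    return [l * p / p_d for l, p in zip(likelihoods, priors)]

# Illustrative numbers only: three mutually exclusive, exhaustive hypotheses.
priors = [0.5, 0.3, 0.2]
likelihoods = [0.1, 0.4, 0.7]
post = posterior(priors, likelihoods)
print(post)
```

Note that the hypothesis with the smallest prior ends up with the
largest posterior here, because the datum is much more likely under
it than under the alternatives.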


When a sequence of observations is made, the rule is applied
recursively, and the value of p(H_i|D) that is computed as the
result of one observation becomes the p(H_i) for the following
computation. That is to say, the posterior probabilities resulting
from one observation become the prior probabilities for the
next one. Thus, equation (3) may be written more appropriately as:

$$ p_n(H_i|D) = \frac{p(D|H_i)\, p_{n-1}(H_i|D)}{p_{n-1}(D)}, \tag{4} $$

where p_n(H_i|D) represents p(H_i|D) after the nth observation, and
p_0(H_i|D) or, more appropriately, p_0(H_i), is understood to be the
probability of H_i before any observations are made. We will follow
the convention of using subscripts only when they are essential
for clarity.
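
The recursive use of the rule can be sketched as a loop in which the
output of one application is fed back in as the input to the next.
The sequence of conditional probabilities below is hypothetical and
serves only to show the mechanics.

```python
def update(probs, likelihoods):
    """One application of equation (4): returns p_n(H_i|D) from p_{n-1}(H_i|D)."""
    p_d = sum(l * p for l, p in zip(likelihoods, probs))  # p_{n-1}(D)
    return [l * p / p_d for l, p in zip(likelihoods, probs)]

# Hypothetical two-hypothesis case; each pair is (p(D|H_1), p(D|H_2))
# for one observation in the sequence.
observations = [(0.7, 0.2), (0.7, 0.2), (0.3, 0.8)]

probs = [0.5, 0.5]                      # p_0(H_i), before any observation
for likelihoods in observations:
    probs = update(probs, likelihoods)  # the posterior becomes the next prior
    print(probs)
```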


Bayes rule states, in effect, that if the prior probability
of a hypothesis being true, p(H_i), and the probability of observing
a particular datum given that the hypothesis is true, p(D|H_i), are
known for all i, then the probability that the hypothesis is true given
that the datum has been observed, p(H_i|D), can be calculated in
a straightforward way. In many decision situations, p(H_i) and
p(D|H_i) are not known, and cannot be determined objectively;
therefore, they must be estimated. The significance of the rule
stems from the assumption, for which there is some evidence that
will be considered later, that people are better at estimating
conditional probabilities, p(D|H), than at estimating posterior
probabilities, p(H|D). Obviously, if they were invariably very
good at estimating p(H|D) there would be no need to make use of
Bayes rule to calculate this value; it would suffice to have the
decision maker estimate it directly.

8.6.2  Likelihood  Ratio 

In order to make use of Bayes rule it is not necessary to
require that an individual estimate probabilities explicitly. An
alternative procedure is to have him judge the ratios of pairs of
conditional probabilities. Such ratios are referred to as
likelihood ratios. The likelihood ratio of D given H_1 relative
to D given H_2 may be expressed as follows:

$$ \frac{p(D|H_1)}{p(D|H_2)} $$

The attractiveness of likelihood ratio stems from the fact
that people often find it easier to make the implied judgment
than to estimate conditional probabilities directly. The type of
judgment that is required in this case is of the sort "Event D
is X times as likely if H_1 is true as if H_2 is true." Neither
of the conditional probabilities need be specified explicitly.
A disadvantage associated with its use is the fact that a great
many more judgments are required with respect to each observation.

8.6.3  Other  Methods  for  Obtaining  Probability  Estimates 

Other methods have been used to obtain probability estimates
without having the subject explicitly produce numerical values.
For chips-in-urn problems, for example, Peterson and Phillips
(1966) have had subjects adjust markers on a scaled 0-to-1
continuum so that each interval is equally likely to contain the
true proportion of chips of a specified color. Organist (1964)
developed a simple answer chart which forced a subject to make
his distribution of probabilities over the possible hypotheses
sum to one and also specified what his payoff would be for each
hypothesis if it were correct, given the probability that he
attached to it. Shuford (1967) describes a computer-controlled
system which presents a subject with a set of hypotheses and
allows  him  to  specify  probabilities  by  adjusting  the  lengths  of 
lines  associated  with  the  hypotheses  by  pointing  at  them  with  a 
light  pen.  When  one  line  is  lengthened  or  shortened,  compensatory 
adjustments  are  automatically  made  in  the  remaining  lines  so  that 
the  probabilities  always  sum  to  one.  This  system  also  provides  the 
user  with  information  concerning  the  implications  of  his  probability 
assignments  vis-a-vis  his  payoff,  given  the  truth  of  any 
particular  hypothesis. 

8.6.4 Diagnosticity of Data

Intuition suggests that the more disparate the implications
of two hypotheses, the more informative data should be concerning
which of the hypotheses is likely to be true. In a Bayesian
context the informativeness, or "diagnosticity," of data is defined
in  terms  of  the  likelihood  ratio.  Specifically,  the  magnitude  of 
a likelihood  ratio  is  said  to  represent  the  diagnosticity  of  a 
datum  with  respect  to  the  two  particular  hypotheses  involved.  The 
more  the  ratio  differs  from  1:1,  in  either  direction,  the  more 
informative  the  datum  is  with  respect  to  which  of  the  hypotheses 
under  consideration  is  correct,  and  the  more  the  distribution  of 
probabilities  over  these  hypotheses  will  change  as  a consequence. 
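
One way to make "differs from 1:1, in either direction" concrete is
to take the absolute value of the logarithm of the likelihood ratio,
so that ratios of 4:1 and 1:4 count as equally diagnostic. The sketch
below rests on that assumption; the text does not prescribe a
particular numerical measure, and the function name and values are
illustrative.

```python
from math import log

def diagnosticity(p_d_given_h1, p_d_given_h2):
    """|log likelihood ratio|: 0 for a 1:1 ratio, larger as the ratio departs from it."""
    return abs(log(p_d_given_h1 / p_d_given_h2))

print(diagnosticity(0.5, 0.5))   # 1:1 ratio, an uninformative datum
print(diagnosticity(0.8, 0.2))   # 4:1 ratio
print(diagnosticity(0.2, 0.8))   # 1:4 ratio, same magnitude as 4:1
```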

8.6.5  Odds 

The ratio of two posterior probabilities is referred to as
the posterior "odds" with respect to the associated hypotheses.
The posterior odds of H_1 with respect to H_2 may be expressed as

$$ \frac{p_n(H_1|D)}{p_n(H_2|D)} $$

or, equivalently, as

$$ \frac{p(D|H_1)\, p_{n-1}(H_1|D) \,/\, p(D)}{p(D|H_2)\, p_{n-1}(H_2|D) \,/\, p(D)} \tag{6} $$

or, since p(D) cancels, as

$$ \frac{p_n(H_1|D)}{p_n(H_2|D)} = \frac{p(D|H_1)}{p(D|H_2)} \cdot \frac{p_{n-1}(H_1|D)}{p_{n-1}(H_2|D)}, \tag{7} $$

which is to say that the posterior odds is simply the prior odds
multiplied by the likelihood ratio. Letting $\Omega_{n;i,j}$ represent the
odds of H_i with respect to H_j after the nth observation, and
$L_{i,j}$ the corresponding likelihood ratio, we may express the
relationship as follows:

$$ \Omega_{n;i,j} = L_{i,j}\, \Omega_{n-1;i,j}. \tag{8} $$

Obviously

$$ \Omega_{j,i} = \frac{1}{\Omega_{i,j}} \tag{9} $$

and

$$ L_{j,i} = \frac{1}{L_{i,j}}. $$

Often it is clear from the context which of the two terms of
either an odds ratio or a likelihood ratio is to be the numerator
and which the denominator, so the subscripts are omitted and the
expression is written more simply as

$$ \Omega_n = L\, \Omega_{n-1}. $$

It is essential, however, that the same hypothesis, whether H_i or H_j,
be represented in the same position (numerator or denominator) in
both ratios.
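
The equivalence of the odds form and the probability form of the
rule can be verified numerically. The sketch below uses illustrative
values that are not taken from the report, and compares the posterior
odds obtained by multiplying the prior odds by the likelihood ratio
with the odds obtained from the posterior probabilities themselves.

```python
def posterior_prob(prior1, l1, l2):
    """p(H_1|D) for two hypotheses via Bayes rule in probability form."""
    prior2 = 1.0 - prior1
    return l1 * prior1 / (l1 * prior1 + l2 * prior2)

prior1, l1, l2 = 0.6, 0.7, 0.2      # illustrative values only

prior_odds = prior1 / (1.0 - prior1)
likelihood_ratio = l1 / l2
odds_via_ratio = likelihood_ratio * prior_odds   # prior odds times likelihood ratio

p1 = posterior_prob(prior1, l1, l2)
odds_via_probs = p1 / (1.0 - p1)                 # ratio of the two posteriors

print(odds_via_ratio, odds_via_probs)
```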


8.6.6 Applications of Bayes Rule in the Two-Hypothesis Case

To summarize what has been said so far, Bayes rule represents
a procedure for evaluating hypotheses in situations that have the
following characteristics: (a) the possible states of the world
can be explicitly represented by an exhaustive and mutually
exclusive set of possibilities; (b) discrete observations may be
made in an effort to find more information about the actual state
of the world; and (c) for the data obtained from each observation,
it must be reasonable to assign a number that represents the
probability that those data would have been obtained, given the
truth of any specific one of the hypothesized states of the world.
In order to get an appreciation of how Bayes rule extracts
information from data, it will be helpful to consider some concrete
examples of decision tasks to which the rule might be applied. We
will focus first on the simple case in which the hypothesis set
contains only two alternatives.

Imagine an urn containing red and black chips. Suppose two
hypotheses, H_1 and H_2, are stated, one and only one of which is
true, concerning what proportion of the chips in the urn are red.
The task is to decide which of these hypotheses is the true one.
Data are obtained by sampling chips one at a time, replacing each
chip after it is examined. Assume that the chips are thoroughly
mixed before each observation and that the probability of drawing
a red chip on a trial is exactly R/(R+B), where R is the number of
red chips, and B the number of black chips, in the urn.

Suppose the first hypothesis, H_1, is that 70% of the chips
are red, and that the second hypothesis, H_2, is that 20% of the
chips are red. Suppose further that the prior probabilities are
equal, that is, p_0(H_1) = p_0(H_2) = .5. Figure 1 shows how
p(H_1|D) and p(H_2|D) change as a result of applying Bayes rule to
the data obtained in the following ten successive observations:
RRBBRRBRRR. Figures 2 and 3 show how the odds, and the uncertainty
in the information theoretic sense of the word, change from
observation to observation. Uncertainty is, of course, a monotone
but nonlinear function of the difference between the probabilities
associated with the two hypotheses.
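
The quantities plotted in the three figures can be recomputed from
the stated hypotheses and the sequence RRBBRRBRRR. The following is
a sketch only: it assumes that "uncertainty" means Shannon uncertainty
in bits over the two hypotheses, and the function names are
hypothetical.

```python
from math import log2

P_RED = {"H1": 0.7, "H2": 0.2}        # p(red | hypothesis), from the example

def step(p1, datum):
    """Update p(H_1|D) for one draw, 'R' or 'B', in the two-hypothesis case."""
    l1 = P_RED["H1"] if datum == "R" else 1 - P_RED["H1"]
    l2 = P_RED["H2"] if datum == "R" else 1 - P_RED["H2"]
    return l1 * p1 / (l1 * p1 + l2 * (1 - p1))

def uncertainty(p1):
    """Shannon uncertainty in bits over the two hypotheses (an assumed measure)."""
    return -sum(p * log2(p) for p in (p1, 1 - p1) if p > 0)

p1 = 0.5                               # equal prior probabilities
history = []
for n, datum in enumerate("RRBBRRBRRR", start=1):
    p1 = step(p1, datum)
    odds = p1 / (1 - p1)               # odds of H_1 with respect to H_2
    history.append((n, datum, p1, odds, uncertainty(p1)))
    print(n, datum, round(p1, 3), round(odds, 2), round(uncertainty(p1), 3))
```

Each row of the output gives the observation number, the datum, the
posterior probability of H_1, the odds, and the uncertainty.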


Note that the effect of an observation is not necessarily to
decrease the amount of uncertainty concerning which hypothesis is
true. If the distribution of p(H_i) favors the incorrect hypothesis,
uncertainty is very likely to increase as a result of observing
data before it decreases. Even if the distribution of p(H_i)
favors the correct hypothesis, or weights both hypotheses equally,
uncertainty may increase on individual trials. In this case,
however, it will decrease on the average, assuming unbiased sampling.


Another interesting and perhaps counterintuitive observation
concerning figure 1 is the very large effect that the one or two
initial observations can have. In our example, the initial
drawing of two successive reds had the result of making one of the
(initially equally likely) hypotheses over twelve times more likely
than the other.
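
The figure of "over twelve" follows from the odds form of the rule.
Writing $\Omega_n$ for the odds of H_1 with respect to H_2 after n
observations, with prior odds of 1 and a likelihood ratio of .7/.2
for each red chip, the arithmetic is:

```latex
\Omega_2 = \left(\frac{.7}{.2}\right)^{2} \Omega_0 = (3.5)^{2} \times 1 = 12.25,
\qquad
p(H_1 \mid D) = \frac{12.25}{1 + 12.25} \approx .925 .
```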

Intuitively, one would expect that the degree of confidence
that one should have that the proportion of reds and blacks in one's
sample reflects the true proportion in the population should depend
on the sample size. That the application of Bayes theorem does not
violate this intuition may be seen by comparing the probability
distribution after the third observation and after the sixth
observation (figure 1). In both cases, red chips have comprised 67
percent of the sample; however, the uncertainty is less following the
sixth observation than following the third, reflecting the fact that
the sample size was larger in the former case.

Figure 1. Changes in posterior probabilities, p(H_1|D) and p(H_2|D),
as a result of the indicated observations of Red and Black chips.
H_1: 70% Red; H_2: 20% Red; p_0(H_1) = p_0(H_2) = .5. (Abscissa
labels: datum; percent of reds in sample.)

Figure 2. Changes in odds, $\Omega_{1,2}$, as a result of the
indicated observations of Red and Black chips. H_1: 70% Red;
H_2: 20% Red; p_0(H_1) = p_0(H_2) = .5. (Abscissa labels: datum;
percent of reds in sample.)

Figure 3. Changes in uncertainty concerning hypotheses as a result
of the indicated observations of Red and Black chips. H_1: 70% Red;
H_2: 20% Red. (Abscissa labels: observation number; datum,
RRBBRRBRRR; percent of reds in sample.)


Figure 4 shows how the probabilities change over the course
of ten observations in which reds and blacks occur with the same
frequency but in a different order. In particular, the first two
observations in this case produced blacks, and the second two, reds.
Observations 5 through 10 are assumed to be the same as in the
original example. Note that by the end of the fourth trial, the
proportion of red and black draws was the same in both examples;
consequently, the probability distributions are the same at this
point and thereafter. This illustrates the fact that the Bayesian
calculation of p(H_i|D) is path-independent, in the sense that the
effect of an observation is strictly dependent on the current value
of p(H_i), and independent of the particular sequence of observations
on which that value is based. The calculation is also independent
of the number of observations on which the current value of p(H_i)
is based. Note that this point is different from the one made above
concerning the effect of sample size on uncertainty. The point
that was made above was that the probability that a given proportion
of reds in one's sample accurately reflects the proportion in the
population increases with sample size. The point here is that the
effect that an observation will have is independent of how
p(H_i) got to be whatever it is.
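
Path independence can be checked by applying the rule to every
ordering of the same four draws. This is a sketch using the
parameters of the urn example; the function name is hypothetical.

```python
from itertools import permutations

def posterior_h1(sequence, prior=0.5, red_h1=0.7, red_h2=0.2):
    """p(H_1|D) after applying Bayes rule to each draw in turn."""
    p1 = prior
    for datum in sequence:
        l1 = red_h1 if datum == "R" else 1 - red_h1
        l2 = red_h2 if datum == "R" else 1 - red_h2
        p1 = l1 * p1 / (l1 * p1 + l2 * (1 - p1))
    return p1

a = posterior_h1("RRBB")   # first four draws of the original example
b = posterior_h1("BBRR")   # first four draws of the reordered example
print(a, b)

# Every ordering of two reds and two blacks ends at the same value.
vals = [posterior_h1(seq) for seq in set(permutations("RRBB"))]
print(max(vals) - min(vals))   # differs from zero only by floating-point rounding
```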


Figures 5, 6 and 7 illustrate the effects of setting
the initial values of p(H_1) and p(H_2) to something other than .5.
The sequence of draws is identical to that in figure 1, and
consistent with what might be expected if the true hypothesis were H_1.
In each figure, one curve shows the effect of these observations
given that p_0(H_1) = .8; another shows the effect given that
p_0(H_1) = .2, and the third represents p_0(H_1) = .5. The main thing
to notice is that the effect of an initial incorrect bias is largely
nulled out by relatively few observations. This point is frequently
made by proponents of Bayesian information-processing systems in
response to the observation that a priori probabilities are
sometimes difficult to assign on anything other than an arbitrary
basis. A fact that usually is not pointed out is illustrated in
figure 7: changing the distribution of a priori probabilities shifts
the function relating log odds to data by a constant.
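
The constant shift follows from the fact that, in logarithmic form,
each observation adds the log likelihood ratio to the log odds, so
that a change in the prior odds is simply carried along. A minimal
sketch, using the priors and draws of the example (natural logarithms
are an arbitrary choice here):

```python
from math import log

L_RED, L_BLACK = 0.7 / 0.2, 0.3 / 0.8   # likelihood ratios for the urn example

def log_odds_path(prior_h1, draws="RRBBRRBRRR"):
    """Log odds of H_1 to H_2 after each draw, starting from p_0(H_1)."""
    value = log(prior_h1 / (1 - prior_h1))
    path = []
    for datum in draws:
        value += log(L_RED if datum == "R" else L_BLACK)
        path.append(value)
    return path

base = log_odds_path(0.5)
for prior in (0.8, 0.2):
    shifts = [x - y for x, y in zip(log_odds_path(prior), base)]
    print(prior, round(min(shifts), 6), round(max(shifts), 6))
```

For each prior the smallest and largest shifts agree, which is to
say that the curve is displaced by the same amount at every
observation.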


8.6.7 Expected Effects of Observations on Hypotheses

In the foregoing examples of the application of Bayes rule, we
have considered how probabilities may change as a result of a
sequence of specific observations. It has been apparent from these
examples that the effect of an observation sometimes is to increase
the probability associated with the true hypothesis and sometimes to
decrease it. On the average, however, we expect the probability
associated with the true hypothesis to increase with each
observation, and that associated with a false hypothesis to decrease,
assuming an unbiased sampling of the data. We turn now to a
consideration of how p(H_i|D) can be expected to change, on the
average, as a result of applying Bayes rule, if chips are drawn from
an urn that contains reds and blacks in the proportion indicated by a
specified hypothesis, say H_j.

Figure 4. Changes in posterior probabilities, p(H_1|D) and p(H_2|D),
as a result of the indicated observations of Red and Black chips.
H_1: 70% Red; H_2: 20% Red; p_0(H_1) = p_0(H_2) = .5. (Note that the
results of the observations are the same as in figure 1 except that
the first four produce a different ordering of Reds and Blacks.)

Figure 6. Effects of indicated observations on odds, $\Omega_{1,2}$,
for different initial values of p(H_1): .80, .50, and .20.
H_1: 70% Red; H_2: 20% Red.

It will facilitate the discussion to begin by considering all
possible outcomes of a specific experiment. Figure 8 shows all
possible effects of four observations on p(H_1|D), in the case of
our example of H_1: 70% R, H_2: 20% R, and p(H_1) = .5. Each node
in the graph represents one possible value of p(H_1|D) after the
number of observations indicated on the abscissa; no values are
possible other than those represented by nodes. (By rotating the
graph in figure 8 about a horizontal axis passing through the .5
point on the ordinate, one would produce the graph of p(H_2|D);
which is to say, each of the points in the graph of p(H_2|D) is the
complement of a point in the graph of p(H_1|D).) In general, after
N observations, p(H_1|D) will have one of N+1 possible values. After
two observations, for example, p(H_1|D) will have one of the three
values .925, .568, or .123. The number above each node indicates the
number of ways to arrive at that node. There are three ways, for
example, to arrive at the node at p(H_1|D) = .821, N = 3: RRB, RBR
and BRR. The set of numbers associated with a given value of N
will be recognized as the coefficients of the terms of the expansion
of (a+b)^N, the so-called "binomial coefficients." In our
application, each of these coefficients, which may be written as
$\binom{N}{m}$, represents the number of ways that N events can be
composed of m events of one type and N-m of another. The events of
interest in our case are draws of chips from an urn, and the two types
are draws of red and black chips, respectively. The sum of these
coefficients for given N,

$$ \sum_{m=0}^{N} \binom{N}{m} = 2^N, $$

represents the number of uniquely ordered sequences of reds and
blacks that can result from N draws. Inasmuch as the effect of
applying Bayes rule to a sequence of data is insensitive to the
order in which the data are considered, it is convenient to think
of all sequences having the same combination of reds and blacks
as the same outcome, irrespective of the order in which the reds
and blacks have occurred. Thus, the effective number of possible
outcomes of N draws is N+1 rather than 2^N.
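
The node values and path counts described above can be regenerated
for any N. The sketch below assumes the parameters of the example
(H_1: 70% red, H_2: 20% red, equal priors); the function name is
hypothetical.

```python
from math import comb

def nodes(n, prior=0.5, red_h1=0.7, red_h2=0.2):
    """Possible values of p(H_1|D) after n draws, with the number of orderings
    (the binomial coefficient) that lead to each."""
    out = []
    for m in range(n, -1, -1):                       # m = number of reds drawn
        like1 = red_h1**m * (1 - red_h1)**(n - m)    # p(data | H_1), any one order
        like2 = red_h2**m * (1 - red_h2)**(n - m)    # p(data | H_2), any one order
        p1 = like1 * prior / (like1 * prior + like2 * (1 - prior))
        out.append((m, comb(n, m), round(p1, 3)))
    return out

for n in range(1, 5):
    print(n, nodes(n))
```

Each tuple gives the number of reds, the number of paths to the node,
and the value of p(H_1|D) at the node. For n draws there are n+1
tuples, and the path counts sum to 2**n.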

Figure 8 shows the graph of possible outcomes for our
hypothetical experiment as they pertain to p(H_1|D). By the
algebra of expectation, the expected, or mean, value of p(H_1|D)
after N draws is the weighted sum of all possible values, each
value being weighted by its probability of occurrence. In terms
of figure 8, the expected value of p(H_1|D) after N draws may be
found by multiplying the value of each node above
N on the abscissa by the probability of arriving at that node,
and summing over these products.

Suppose that the probability that an observation will yield
a red chip (solid lines) is q, and the probability that it will
yield a black one (dotted lines) is 1-q. The probability of
arriving at a given node in the graph, via a particular path, is
the product of the probabilities associated with the links in that
path. The probability of arriving at a given node, irrespective
of the path, is the sum of the probabilities associated with all
possible paths to that node. But every path leading to a common
node has exactly the same probability of being traversed, because
each is composed of the same combination of R and B links. So,
the easy way to calculate the probability of arriving at a node
is to take the product of the probability of traversing any
specific path to that node and the number of paths leading to that node.
Figure 9 shows expressions for these probabilities for each
of the nodes in our sample graph. In general, the probability of
arriving at a given node via a specific path composed of m R links
and N-m B links is given by

\[ q^m (1-q)^{N-m} , \]

and the probability of arriving at a node via any such path by

\[ \pi_{N,m} = \binom{N}{m} q^m (1-q)^{N-m} . \tag{13} \]

The expected value of $p(H_i|D)$ after N observations, then,
is given by

\[ E[p_N(H_i|D)] = \sum_{m=0}^{N} \pi_{N,m} \, p_{i;N,m} , \tag{14} \]

where $p_{i;N,m}$ represents the posterior probability of $H_i$ after N
observations, m of which have yielded red chips.
The following iterative formula can be used to compute $\pi_{N,m}$:

\[ \pi_{N,m+1} = \frac{(N-m)\,q}{(m+1)(1-q)} \, \pi_{N,m} , \tag{15} \]

where

\[ \pi_{N,0} = (1-q)^N , \tag{16} \]

and computation can be simplified by taking logarithms:

\[ \log \pi_{N,m+1} = \log X + \log \pi_{N,m} , \tag{17} \]

where

\[ X = \frac{(N-m)\,q}{(m+1)(1-q)} \tag{18} \]

and

\[ \log \pi_{N,0} = N \log (1-q) . \tag{19} \]
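
As an illustration of equations (15) through (19), the recurrence can be written out as below. This is a sketch only; the function names are hypothetical, and it assumes q lies strictly between 0 and 1 so that the divisions and logarithms are defined.

```python
from math import comb, exp, log

def arrival_probs(N, q):
    """pi[m] = probability of m reds in N draws, built with equations (15)-(16)."""
    pi = [(1 - q) ** N]                          # equation (16): pi_{N,0}
    for m in range(N):
        x = (N - m) * q / ((m + 1) * (1 - q))    # equation (18)
        pi.append(x * pi[-1])                    # equation (15)
    return pi

def arrival_probs_log(N, q):
    """The same recurrence carried out in logarithms, equations (17)-(19)."""
    log_pi = [N * log(1 - q)]                    # equation (19)
    for m in range(N):
        log_x = log((N - m) * q) - log((m + 1) * (1 - q))
        log_pi.append(log_pi[-1] + log_x)        # equation (17)
    return [exp(v) for v in log_pi]

direct = [comb(10, m) * 0.7 ** m * 0.3 ** (10 - m) for m in range(11)]
iterative = arrival_probs(10, 0.7)
```

The iterative values agree with the direct binomial expression of equation (13) up to floating-point error, and they sum to one.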


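The value of q in equation (13) depends, of course, on which
of the hypotheses under consideration happens to be true. The
expectation can be computed, however, for each of the possibilities.
The general expression may be written as follows:

\[ E[p_N(H_i|D) \mid H_j \text{ is true}] = \sum_{m=0}^{N} \binom{N}{m} \, p(R|H_j)^m \, p(B|H_j)^{N-m} \, p_{i;N,m} . \]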
Figure 9. Graph illustrating the computation of expected
values of posterior probabilities ($H_1$: 70% R; $H_2$: 20% R)
(The expression above each node represents
the probability of arriving at that node, given
q is the probability of drawing a red chip. The
expected value of the posterior probability
following N observations is the sum of the values
of the nodes above N, each weighted by its
"arrival" probability.)
Figure 10 shows $E[p_N(H_i|D)]$ for our example ($H_1$: 70% red,
$H_2$: 20% red), given $H_i$ is true (top curve) and given $H_i$ is not true
(bottom curve). The top curve shows the expected growth of
$p(H_1|D)$ when $H_1$ is true and of $p(H_2|D)$ when $H_2$ is true. Conversely,
the bottom curve represents the expected decline of $p(H_1|D)$ when
$H_2$ is true and of $p(H_2|D)$ when $H_1$ is true. Thus, in the
two-alternative case,

\[ E[p(H_1|D) \mid H_1 \text{ is true}] = E[p(H_2|D) \mid H_2 \text{ is true}] . \]
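
The expectation described above can be sketched numerically as follows. The function name and argument names are hypothetical; the sketch assumes two hypotheses whose proportions of reds lie strictly between 0 and 1.

```python
from math import comb

def expected_posterior(N, q1, q2, q_true, prior1=0.5):
    """E[p_N(H1|D)] when reds are actually drawn with probability q_true."""
    total = 0.0
    for m in range(N + 1):
        like1 = q1 ** m * (1 - q1) ** (N - m)     # p(specific path | H1)
        like2 = q2 ** m * (1 - q2) ** (N - m)     # p(specific path | H2)
        post1 = prior1 * like1 / (prior1 * like1 + (1 - prior1) * like2)
        arrival = comb(N, m) * q_true ** m * (1 - q_true) ** (N - m)
        total += arrival * post1                  # node value times arrival probability
    return total

# Top curve: p(H1|D) when H1 is true; and p(H2|D) = 1 - p(H1|D) when H2 is true.
top_h1 = expected_posterior(10, 0.7, 0.2, q_true=0.7)
top_h2 = 1 - expected_posterior(10, 0.7, 0.2, q_true=0.2)
```

With equal prior probabilities the two quantities coincide, which is the equality stated for the two-alternative case.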


To compute the expected uncertainty following N observations,
one must compute the uncertainty associated with each of the possible
outcomes of the observations, and then take a weighted average of
these uncertainties, the weights being the probabilities of
occurrence of the specific outcomes. The uncertainty associated with a
specific outcome, say the outcome N observations yielding m red
chips, is given by

\[ U_{N,m} = -\sum_{i=1}^{h} p_{i;N,m} \log_2 p_{i;N,m} , \tag{22} \]

where h is the number of hypotheses under consideration and $p_{i;N,m}$
is the probability associated with the ith hypothesis after
N observations yielding m red chips. The expected uncertainty
after N observations, then, is obtained by weighting each $U_{N,m}$
by its probability of occurrence, and summing over all
possible outcomes. Thus,

\[ E(U_N) = \sum_{m=0}^{N} \pi_{N,m} U_{N,m} , \tag{23} \]

where $\pi_{N,m}$ is defined as before. Again, inasmuch as the value
of q in equation (13) depends on which hypothesis is true, the
general expression for $E(U_N)$ conditional upon which hypothesis
is true may be written

\[ E(U_N \mid H_j \text{ is true}) = \sum_{m=0}^{N} \binom{N}{m} \, p(R|H_j)^m \, p(B|H_j)^{N-m} \, U_{N,m} . \tag{24} \]
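
Equations (22) and (24) translate fairly directly into code. The sketch below is illustrative only: the function names are hypothetical, hypotheses are indexed from 0, and `qs` holds the proportion of reds under each hypothesis.

```python
from math import comb, log2

def posteriors(N, m, qs, priors):
    """Posterior probability of each hypothesis after m reds in N draws."""
    joint = [p * q ** m * (1 - q) ** (N - m) for q, p in zip(qs, priors)]
    total = sum(joint)
    return [j / total for j in joint]

def uncertainty(ps):
    """Equation (22): -sum of p log2 p, with 0 log 0 taken as 0."""
    return -sum(p * log2(p) for p in ps if p > 0)

def expected_uncertainty(N, qs, priors, j):
    """Equation (24): E(U_N | H_j is true), hypotheses indexed from 0."""
    q = qs[j]
    return sum(
        comb(N, m) * q ** m * (1 - q) ** (N - m)
        * uncertainty(posteriors(N, m, qs, priors))
        for m in range(N + 1)
    )

# Running example: H1: 70% red, H2: 20% red, equal priors, H1 true.
u10 = expected_uncertainty(10, [0.7, 0.2], [0.5, 0.5], j=0)
```

Before any observations are made the function returns one bit for two equally probable hypotheses, as it should.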
Figure 10. Expected value of posterior probability
given $H_j$ is true, as a function of number
of observations ($H_1$: 70% Red; $H_2$: 20% Red; $p_0(H_i) = .5$)
The computation of expected uncertainty relates to figure 8
in the following way. Imagine that such an outcome graph were
developed for each of the hypotheses under consideration,
conditional upon the truth of a specified hypothesis. The value of
$U_{N,m}$ is found by summing $-p \log_2 p$ across graphs for given N,m,
the values of p being the values of the nodes on the graphs
(equation 22). The value of $E(U_N)$ is then obtained by weighting
each of these sums by the probability of arriving at node N,m,
given the truth of the specified hypothesis, and summing over m
(equation 23). Figure 11 shows how the expected uncertainty
concerning which of the two hypotheses is true changes as a result of
observations in the case of our example ($H_1$: 70% red, $H_2$: 20% red),
given the truth of each hypothesis in turn.

The examples that we have been considering have all converged
rather quickly to a state of relatively low uncertainty. This was
due to the fact that $H_1$ and $H_2$ were quite disparate. But suppose
$H_1$ and $H_2$ were similar with respect to their implications for
data. Suppose, for example, that we let $H_1$ be the same as before
(that is, that 70% of the chips are red) and $H_2$ be the hypothesis
that 60% of the chips are red. Again, setting the initial
probabilities equal to .5 and assuming the same sequence of observations
as indicated in figure 1, figures 12 and 13 show the effects of
these observations on the distribution of probabilities over the
two hypotheses, and on uncertainty. Figures 14 and 15 show the
expected effects of data on posterior probabilities and uncertainty
for this case. Obviously, the expected effects of observations
are much smaller (the data have less diagnostic impact) when the
hypotheses are similar than when they are very different. Or, to
say the same thing in other words, a larger sample is needed to
produce the same degree of certainty with respect to which
hypothesis is true. This illustrates the intuitively compelling idea
that the smaller the differences between two statistical
distributions, the closer one must examine them to tell which is which.
Continued sampling will eventually make the probabilities diverge
and the uncertainty decrease, assuming, of course, that the
sampling is random and one of the hypotheses is in fact true. Figures
16 and 17 show the expected changes in $p(H_i|D)$ and uncertainty
over the course of 200 observations, given $H_1$: 70% red, $H_2$: 60% red,
$p_0(H_i) = .5$. Two hundred observations would not, on the average,
reduce the uncertainty in this case by the amount that ten
observations would reduce it, given the more disparate hypotheses,
$H_1$: 70% red and $H_2$: 20% red.

Table 2 (page 95) shows, for various combinations of $H_1$ and $H_2$,
the expected posterior probability of $H_i$ after ten observations, given
that chips are sampled from an urn containing reds and blacks in
the proportions specified by $H_i$, and assuming $p_0(H_1) = p_0(H_2) = .5$.
Figure 11. Expected uncertainty concerning which
hypothesis is true, as a function of
number of observations ($H_1$: 70% Red; $H_2$: 20% Red)

Figure 13. Changes in uncertainty concerning
hypotheses as a result of the indicated
observations of Red and Black chips
($H_1$: 70% Red; $H_2$: 60% Red; $p_0(H_i) = .5$)

Figure 15. Expected uncertainty concerning which
hypothesis is true, as a function of
number of observations
($H_1$: 70% Red; $H_2$: 60% Red; $p_0(H_i) = .5$)

Figure 17. Expected uncertainty concerning which
hypothesis is true, as a function of
number of observations
($H_1$: 70% Red; $H_2$: 60% Red; $p_0(H_i) = .5$)
TABLE 2. SENSITIVITY OF BAYESIAN ANALYSIS.

Columns: Percentage of Reds according to H1
Rows:    Percentage of Reds according to H2

         00    10    20    30    40    50    60    70    80    90   100

  00    .50   .74   .90   .97   .99  1.00  1.00  1.00  1.00  1.00  1.00
  10    .74   .50   .59   .72   .83   .90   .95   .98   .99  1.00  1.00
  20    .90   .59   .50   .56   .67   .79   .87   .94   .97   .99  1.00
  30    .97   .72   .56   .50   .55   .66   .77   .87   .94   .98  1.00
  40    .99   .83   .67   .55   .50   .55   .65   .77   .87   .95  1.00
  50   1.00   .90   .79   .66   .55   .50   .55   .66   .79   .90  1.00
  60   1.00   .95   .87   .77   .65   .55   .50   .55   .67   .83   .99
  70   1.00   .98   .94   .87   .77   .66   .55   .50   .56   .72   .97
  80   1.00   .99   .97   .94   .87   .79   .67   .56   .50   .59   .90
  90   1.00  1.00   .99   .98   .95   .90   .83   .72   .59   .50   .74
 100   1.00  1.00  1.00  1.00  1.00  1.00   .99   .97   .90   .74   .50

Cells represent expected values of posterior probabilities
associated with correct hypothesis after 10 draws from (either)
one of the urns.
There are several things to notice about this table. First,
it represents the expected value of the probability associated with
the true hypothesis, so it represents $p(H_1|D)$ if data are sampled
from an urn for which $H_1$ is true, and $p(H_2|D)$ if the sample is
taken from an urn for which $H_2$ is true. A second thing to notice
about the table is the fact that it is symmetric about the minor
diagonal. Thus, for example, the probability associated with the
true hypothesis is the same for $H_1$: x% red, $H_2$: y% red as for
$H_1$: y% red, $H_2$: x% red. This is a trivial point, and simply
indicates that the expected effect of a sequence of observations
is strictly a function of the diagnosticity of data and is
independent of which hypothesis is which. Third, except when one of the
hypotheses is extreme (say, hypothesizes that 10% or less, or 90%
or more, of the chips are of one color), the expected impact of
data is largely a function of the difference between the
hypothesized percentages and relatively independent of their absolute
magnitudes.

It was pointed out above that for various combinations of
$H_1$ and $H_2$, the first one or two observations can have a remarkably
large effect. How much effect they will have depends, however,
on what those observations are and on the disparity between $H_1$
and $H_2$. This point is illustrated by figure 18. The figure shows
the probability of $H_1$ given a single observation that yields a
red chip. In all cases, it is assumed that the hypotheses were
equally probable before the observation. Note that if the hypotheses
are disparate, for example, $H_1$: 90% red and $H_2$: 10% red, or $H_1$: 10%
red and $H_2$: 90% red, a single observation will change the
probabilities associated with $H_1$ and $H_2$ from .5 and .5 to .9 and .1,
or to .1 and .9. On the other hand, if the hypothesized proportions
are very close, say .5 and .6, a single observation will change
them very little.
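
The single-observation effect can be worked out in a few lines. The sketch below assumes equal prior probabilities unless `prior1` is given; the function name is hypothetical.

```python
def posterior_after_one_red(q1, q2, prior1=0.5):
    """p(H1 | one red chip); q1 and q2 are the hypothesized proportions of reds."""
    joint1 = prior1 * q1
    joint2 = (1 - prior1) * q2
    return joint1 / (joint1 + joint2)

for q1, q2 in [(0.9, 0.1), (0.1, 0.9), (0.5, 0.6)]:
    print(q1, q2, round(posterior_after_one_red(q1, q2), 3))
```

For the disparate pairs the probability of H1 moves from .5 to .9 or to .1, whereas for the pair .5 and .6 it moves only from .5 to 5/11.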

8.6.8  The  Symmetrical  Two-Hypothesis  Case 

A two-hypothesis case of special interest is that for which
one of two possible observations has the same probability given
one hypothesis as does the other observation given the other
hypothesis. That is, we are concerned with the situation in which
$p(D_\alpha|H_1) = p(D_\beta|H_2)$, or equivalently, in which
$p(D_\alpha|H_1) = 1 - p(D_\alpha|H_2)$.

This is sometimes referred to as the "symmetrical" case, reflecting
the fact that one of the two possible observations provides exactly
as much support for one of the hypotheses as does the other
observation for the other hypothesis. This situation holds in the
chips-in-urn context when both hypotheses involve the same
proportional split of chips of different colors, but one identifies
red chips, and the other black chips, as being the more numerous.
Figure 18. The effect of drawing a single Red chip,
given various combinations of prior
hypotheses concerning the proportion
of Reds in the urn ($p_0(H_1) = p_0(H_2) = .5$)
The hypothesis pair $H_1$: 70% R, 30% B; $H_2$: 30% R, 70% B satisfies
this condition, for example; whereas the pair $H_1$: 70% R, 30% B;
$H_2$: 20% R, 80% B does not.

The symmetrical case is of special interest because
the effect of a series of observations on the odds favoring
one hypothesis over the other can be calculated in a trivially
simple way. If $\Omega_0$ represents the odds prior to the observations
of interest, and L represents the likelihood ratio associated with
an observation of $D_\alpha$, then

\[ \Omega_d = L^d \, \Omega_0 , \tag{25} \]

where d represents the difference between the number of
observations of $D_\alpha$ (say, red chips) and of $D_\beta$ (say, black chips), and
$\Omega_d$ represents the odds following the observations. Note that the
size of the sample, that is, the number of observations, does not enter into
this calculation. Suppose, for example, that $\Omega_0 = 1$ and $L = 3$
(as would be the case if $p(R|H_1) = .75$ and $p(R|H_2) = .25$, and the
odds and likelihood ratio were expressed $H_1$ relative to $H_2$); then,
given a sequence of observations yielding four more red chips
than black chips, the posterior odds would be

\[ \Omega_d = 3^4 \cdot 1 = 81 , \tag{26} \]

and the same result would hold whether the difference of four was
obtained from a sample containing 8 reds and 4 blacks or one
containing 100 reds and 96 blacks.
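
The claim that only the difference d matters can be checked by multiplying likelihood ratios draw by draw. This is a minimal sketch with a hypothetical function name; the probabilities .75 and .25 are those of the example in the text.

```python
def posterior_odds(sequence, p_red_h1, p_red_h2, prior_odds=1.0):
    """Multiply the prior odds (H1 relative to H2) by the likelihood ratio of each draw."""
    odds = prior_odds
    for chip in sequence:
        if chip == "R":
            odds *= p_red_h1 / p_red_h2
        else:
            odds *= (1 - p_red_h1) / (1 - p_red_h2)
    return odds

small = posterior_odds("R" * 8 + "B" * 4, 0.75, 0.25)     # 8 reds, 4 blacks
large = posterior_odds("R" * 100 + "B" * 96, 0.75, 0.25)  # 100 reds, 96 blacks
```

Both samples have four more reds than blacks, and both leave the odds at 81 to 1 up to floating-point error.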

The exclusive dependence of $\Omega_d$ on d follows directly from
the fact that the likelihood ratio for one of the two possible
observations is the reciprocal of that for the other observation.

Recall from equation (11) that the posterior odds following a
single observation is simply the prior odds multiplied by the
likelihood ratio associated with the observation:

\[ \Omega_n = L \, \Omega_{n-1} . \tag{11} \]

Recall, too, however, that the likelihood ratio is conditional on
the observation. Thus, if $D_\alpha$ is observed,

\[ L = \frac{p(D_\alpha|H_1)}{p(D_\alpha|H_2)} , \tag{27} \]

whereas, if $D_\beta$ is observed,

\[ L = \frac{p(D_\beta|H_1)}{p(D_\beta|H_2)} . \tag{28} \]



NAVTRAEQUIPCEN  73-C-0128-1 


Letting $L_\alpha$ represent the likelihood ratio when $D_\alpha$ is observed,
and $L_\beta$ the likelihood ratio when $D_\beta$ is observed, we may represent
the effect of a specific sequence of observations, say

\[ D_\alpha \; D_\alpha \; D_\beta \; D_\alpha \; D_\beta \; D_\alpha , \]

on the odds as

\[ \Omega_n = L_\alpha^4 \, L_\beta^2 \, \Omega_{n-6} . \tag{29} \]

In the symmetrical case, however,

\[ L_\beta = L_\alpha^{-1} , \tag{30} \]

so the effect of the same specific sequence of observations may
be written as

\[ \Omega_n = L_\alpha^4 \, L_\alpha^{-2} \, \Omega_{n-6} = L_\alpha^2 \, \Omega_{n-6} , \tag{31} \]

and in general

\[ \Omega_n = L_\alpha^d \, \Omega_{n-k} , \tag{32} \]

where d is the number of observations of $D_\alpha$ minus the number of
observations of $D_\beta$. But, inasmuch as neither n nor k is used in
the calculation, we may express $\Omega$ as a function of d, and write
the expression as in equation (25).

We see then that in the symmetrical case, the odds increase
exponentially with the difference between the number of
observations of the one type and that of the other type that have been
obtained. Figure 19 shows how the rate of growth of this function
depends on the disparity between the conditional probabilities, or,
equivalently, on the size of the likelihood ratio. Figure 20
shows how the size of the difference that is required to realize
a given odds varies with the larger of the conditional probabilities
of which the likelihood ratio is comprised. The finding that people
typically tend to be conservative Bayesians in their use of data
to revise their estimates of the likelihoods of the possible states
of the world suggests that many people would find the relationships
that are shown in these figures to be counterintuitive. The fact,
for example, that with conditional probabilities of $p(D_\alpha|H_1) = .7$
and $p(D_\alpha|H_2) = .3$, a sample that contains three more observations of
$D_\alpha$ than of $D_\beta$ will favor $H_1$ over $H_2$ by a factor of more than 10
may be surprising; as may the fact that with a difference of six,
the odds are greater than 100 to 1.

Figure 19. The rate of growth in posterior odds as a
function of the difference, d, between
the number of observations favoring $H_1$
and the number favoring $H_2$, in the
symmetrical case (The parameter is the
likelihood ratio, L)

Another aspect of the symmetrical case that some readers may
find counterintuitive is the fact that the total effect of a series
of observations on the odds depends only on the difference between
the number of observations of the two types and is independent of
the total number of observations made. Both intuition and
statistical training suggest that one's confidence in any inference that
is to be drawn from the properties of a sample should increase with
the sample size. The apparent paradox is resolved by a recognition
of the fact that, except under the hypothesis that each observation
is equally likely, the absolute difference (though not the relative
difference) between the frequencies of occurrence of the two types
of observation is expected to increase with sample size.
Specifically, if H: x R, (1-x) B is true (with x expressed as a
proportion), the difference between the
number of Rs and Bs in a sample of size N should be (2x-1)N, on
the average.

Consider, for example, the symmetrical hypotheses $H_1$: 70% R,
30% B and $H_2$: 30% R, 70% B. If $H_1$ were true, samples of ten draws
would be expected to produce four more reds than blacks on the
average; and the odds following a ten-draw sample with four more
reds than blacks would be about 30 to 1 in favor of $H_1$. Samples of
100 draws, given $H_1$, should produce 40 more reds than blacks, on
the average, a difference that would drive the odds to more than
523 trillion to 1. Thus, the odds do tend to increase with sample
size because d tends to increase with sample size. A sample of
100 draws that produced four more reds than blacks would be quite
unlikely if $H_1$ were true, and thus would not constitute strong
evidence in favor of that hypothesis. It would be even less
likely, however, if $H_2$ were true, so it does constitute some
evidence for $H_1$, but only as much as one would expect to obtain
from a much smaller sample. Table 3 shows the odds favoring $H_1$,
given various combinations of $p(D|H_1)$ and $p(D|H_2)$ and several
values of d.
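
The figures quoted in this example follow from the expected difference (2x-1)N and equation (25). The sketch below uses hypothetical function names and assumes prior odds of 1.

```python
def expected_difference(x, n):
    """Expected (reds - blacks) in n draws when the proportion of reds is x."""
    return (2 * x - 1) * n

def symmetric_odds(p, d, prior_odds=1.0):
    """Equation (25) with L = p / (1 - p), the symmetrical-case likelihood ratio."""
    return (p / (1 - p)) ** d * prior_odds

d10 = expected_difference(0.7, 10)      # about 4 more reds than blacks
d100 = expected_difference(0.7, 100)    # about 40 more reds than blacks
odds10 = symmetric_odds(0.7, 4)         # roughly 30 to 1
odds100 = symmetric_odds(0.7, 40)       # of the order of 5 x 10**14 to 1
```

A 100-draw sample with d = 4 would yield the same odds as `odds10`, which is why it counts only as much as the smaller sample.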

8.6.9  The  Several-Hypothesis  Case 

So far, the examples that we have considered to illustrate
the use of Bayes rule have involved only two hypotheses. We turn
now to consideration of a few cases in which there are more than
two hypotheses. Figure 21 illustrates a case in which $H_1$, $H_2$
and $H_3$ represent, respectively, the hypotheses that the percentage
of red chips in the urn is 90, 70 and 50, and shows how the
posterior probabilities associated with these hypotheses would change
TABLE 3. ODDS FAVORING H1 GIVEN THE INDICATED VALUES OF
p(D|H1), p(D|H2) AND d.

                                               d
p(D|H1)  p(D|H2)     L       1       2       4       8       16       32

  .55      .45      1.22   1.2E0   1.5E0   2.2E0   4.9E0   2.4E1    5.8E2
  .60      .40      1.50   1.5E0   2.3E0   5.1E0   2.6E1   6.6E2    4.3E5
  .65      .35      1.85   1.9E0   3.4E0   1.2E1   1.4E2   1.9E4    3.5E8
  .70      .30      2.33   2.3E0   5.4E0   3.0E1   8.8E2   7.7E5    6.0E11
  .75      .25      3.00   3.0E0   9.0E0   8.1E1   6.6E3   4.3E7    1.9E15
  .80      .20      4.00   4.0E0   1.6E1   2.6E2   6.6E4   4.3E9    1.8E19
  .85      .15      5.67   5.7E0   3.2E1   1.0E3   1.1E6   1.1E12   1.3E24
  .90      .10      9.00   9.0E0   8.1E1   6.6E3   4.3E7   1.9E15   3.4E30
  .95      .05     19.00   1.9E1   3.6E2   1.3E5   1.7E10  2.9E20   8.3E40

All odds values are rounded to two significant digits and expressed
in exponential form. To obtain the approximate value of $\Omega$, multiply
the number to the left of the E by ten raised to the power indicated
by the number to the right of the E. For example, 4.3E7 = 4.3 x 10^7.
Figure 21. Changes in posterior probabilities $p(H_i|D)$
as a result of the indicated observations
of Red and Black chips
(Data sequence: R R B B R R B R R R; $H_1$: 90% Red; $H_2$: 70% Red;
$H_3$: 50% Red; $p_0(H_i) = .333$)
as a result of the indicated sequence of observations, given
$p_0(H_i) = .333$. Figure 22 shows the change in uncertainty concerning
which hypothesis is correct as a result of the same sequence of
observations.
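
The sequential revision plotted in figure 21 can be sketched as follows. The function name `update` is hypothetical; the hypotheses, prior probabilities and data sequence (R R B B R R B R R R) are those given for the example.

```python
def update(probs, reds, chip):
    """One application of Bayes rule over any number of hypotheses."""
    likes = [q if chip == "R" else 1 - q for q in reds]   # p(chip | H_i)
    joint = [p * l for p, l in zip(probs, likes)]
    total = sum(joint)
    return [j / total for j in joint]

reds = [0.9, 0.7, 0.5]          # H1, H2, H3: proportion of red chips
probs = [1 / 3] * 3             # equal prior probabilities
history = [probs]
for chip in "RRBBRRBRRR":
    probs = update(probs, reds, chip)
    history.append(probs)

for n, row in enumerate(history):
    print(n, [round(p, 3) for p in row])
```

Each row of `history` gives the three posterior probabilities after the corresponding number of observations; the final row depends only on the totals of seven reds and three blacks, not on their order.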


When only two hypotheses are under consideration, there is
only one odds ratio (or its reciprocal) that can be expressed.
The number of odds ratios that can be expressed grows rapidly,
however, as the number of hypotheses is increased beyond two. In
general, given N hypotheses, there are $\binom{N}{2}$ or N(N-1)/2 odds that
can be expressed considering only pairs of hypotheses. Thus in the
three-hypothesis case we might consider $\Omega_{1,2}$, $\Omega_{1,3}$ or $\Omega_{2,3}$, each
of which is shown in figure 23 for our example.

It may be of interest to consider other than pairwise odds
ratios in the several-alternative case, however. Given five
hypotheses, for example, one might wish to consider the odds of
$H_1$ relative to the combination of $H_3$ and $H_4$, which would be
obtained by taking the ratio of $p(H_1|D)$ to the sum of $p(H_3|D)$ and
$p(H_4|D)$. It may often be of particular interest to consider the
odds of a given hypothesis, $H_i$, relative to all the remaining
hypotheses in combination. Such an odds would give the ratio of
the probability that $H_i$ is true to the probability that one of
the remaining hypotheses is true, i.e., that $H_i$ is false. We
might refer to such an odds as the absolute odds of $H_i$ and
represent it as follows:

\[ \Omega_{i,\bar{i}} = \frac{p(H_i|D)}{\sum_{j \neq i} p(H_j|D)} = \frac{p(H_i|D)}{1 - p(H_i|D)} . \]

Figure 24 shows how the indicated observations affect the
absolute odds of each of the hypotheses of our example.
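
Pairwise and absolute odds are simple functions of the posterior probabilities. The sketch below uses hypothetical function names, indexes hypotheses from 0, and uses an arbitrary illustrative probability vector rather than values from the report.

```python
from itertools import combinations

def pairwise_odds(probs):
    """Odds of H_i relative to H_j for every pair i < j (0-based indices)."""
    return {(i, j): probs[i] / probs[j]
            for i, j in combinations(range(len(probs)), 2)}

def absolute_odds(probs, i):
    """p(H_i|D) / (1 - p(H_i|D)): H_i against all remaining hypotheses combined."""
    return probs[i] / (1 - probs[i])

probs = [0.5, 0.3, 0.2]            # illustrative posterior probabilities only
pairs = pairwise_odds(probs)       # three pairs for three hypotheses
```

With N hypotheses `pairwise_odds` returns N(N-1)/2 entries, while `absolute_odds` gives one value per hypothesis.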

Expected values of posterior probabilities and of uncertainty
may be calculated in the same way when there are several hypotheses
as when there are only two. An outcome graph such as those shown
in figures 8 and 9 could be used to specify all possible posterior
probabilities for a given hypothesis, $H_i$, and their probabilities
of attainment on the assumption that a specified hypothesis, $H_j$, is
true. The weighted sum of the nodes above a particular value
of N would represent, as before, the expected value, after N
observations, of the posterior probability of $H_i$, given that $H_j$ is really
true. Also as before, computation of expected uncertainty involves
summing over both i and m, for given N. Inasmuch as it is possible
to compute an expectation of $p(H_i|D)$ given that $H_j$ is true for all
Figure 22. Changes in uncertainty as a result of
indicated observations
(Data sequence: R R B B R R B R R R; $H_1$: 90% Red; $H_2$: 70% Red;
$H_3$: 50% Red; $p_0(H_i) = .333$)

Figure 23. Changes in pairwise odds as a result of
the indicated observations
(Data sequence: R R B B R R B R R R; $H_1$: 90% Red; $H_2$: 70% Red;
$H_3$: 50% Red; $p_0(H_i) = .333$)

Figure 24. Changes in absolute odds for each hypothesis
as a result of the indicated observations
(Data sequence: R R B B R R B R R R; $H_1$: 90% Red; $H_2$: 70% Red;
$H_3$: 50% Red; $p_0(H_i) = .333$)
possible combinations of values of i and j, the number of outcome
graphs that could be of interest increases with the square of the
number of hypotheses under consideration.

Table 4 shows the expected values of $p(H_i|D)$ and $U_N$, for
N = 1, 2, ... 10, given that $H_j$ is true for all combinations of i
and j in the case of our example ($H_1$: 90% red, $H_2$: 70% red, $H_3$: 50%
red, $p_0(H_i) = .333$). As might be expected, given that $H_j$ is true,
$E[p(H_i|D)]$ increases most rapidly when i = j; which is to say,
the expected value of the probability associated with the true
hypothesis grows faster than that of the probability associated
with either false hypothesis. Counter to intuition, however, this
is not a necessary condition. An example will be considered
presently in which the expected probability associated with a false
hypothesis grows, for a while, at a greater rate than does the
expected probability associated with the true hypothesis, even
when both hypotheses are equally probable a priori. With continued
sampling, however, the probability of the true hypothesis eventually
gets larger than that of any of the false hypotheses. Another point
of interest concerning table 4 is the fact that each of three
columns of values occurs twice: not counting the E(U) columns, the
second and fourth columns
are identical, as are the third and seventh, and the sixth and eighth.
This illustrates the following relationship:

\[ E[p(H_i|D) \mid H_j \text{ is true}] = E[p(H_j|D) \mid H_i \text{ is true}] , \tag{34} \]

that is, the expected posterior probability of $H_i$, given that $H_j$
is true, is the same as the expected posterior probability of
$H_j$, given that $H_i$ is true. This relationship holds in general
(given equal prior probabilities), and
independently of the number of hypotheses under consideration.
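
Equation (34) can be checked numerically by building the full array of expectations. The sketch below is illustrative; the function name is hypothetical, indices are 0-based, and the check uses the equal prior probabilities of the example.

```python
from math import comb

def expectation_matrix(N, reds, priors):
    """E[i][j] = E[p_N(H_i|D) | H_j is true] for all i and j (0-based)."""
    h = len(reds)
    E = [[0.0] * h for _ in range(h)]
    for m in range(N + 1):
        likes = [q ** m * (1 - q) ** (N - m) for q in reds]
        joint = [p * l for p, l in zip(priors, likes)]
        total = sum(joint)
        posts = [x / total for x in joint]          # p_{i;N,m}
        for j in range(h):
            arrival = comb(N, m) * likes[j]         # probability of m reds if H_j is true
            for i in range(h):
                E[i][j] += arrival * posts[i]
    return E

reds = [0.9, 0.7, 0.5]
E1 = expectation_matrix(1, reds, [1 / 3] * 3)
E10 = expectation_matrix(10, reds, [1 / 3] * 3)
```

For N = 1 the first column rounds to .397, .333 and .270, matching the first row of table 4, and the array for N = 10 equals its transpose.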

As in the two-alternative case, the rate at which the expected
values of the posterior probabilities approach one or zero (and,
consequently, the rate at which uncertainty is expected to decrease)
depends on the disparity among the hypotheses. The point is
illustrated in table 5, which shows all values of $E[p_{10}(H_i|D) \mid H_j$ is
true] for two sets of hypotheses: $H_1$, $H_2$ and $H_3$: 90, 70 and 50%
red, and 90, 60 and 30% red. The table also shows the expected
uncertainty after ten observations, $E(U_{10})$, concerning which
hypothesis is true, as a function of which hypothesis actually is
true.

Table 6 shows $E[p_{10}(H_i|D) \mid H_j$ is true] and $E(U_{10} \mid H_j$ is true)
for two sets of five hypotheses. This table illustrates some of the
same points as does table 4. The rate at which the probabilities
change from their original values, and the rate at which uncertainty
decreases depend on the disparity among the hypotheses. The value
of $E[p(H_i|D) \mid H_j$ is true] is always equal to that of $E[p(H_j|D) \mid H_i$
is true], which is seen by the fact that each array, if considered
as a matrix, is equal to its transpose.
TABLE 4. EXPECTED VALUES OF POSTERIOR PROBABILITIES, AND
UNCERTAINTY (IN BITS), GIVEN THAT CHIPS ARE SAMPLED
FROM THE URN FOR WHICH THE INDICATED HYPOTHESIS
IS TRUE.

                                 True Hypothesis
               H1                       H2                       H3
                   Hypothesis for which Expectation Computed
  N    H1    H2    H3   E(U)    H1    H2    H3   E(U)    H1    H2    H3   E(U)

  1   .397  .333  .270  1.53   .333  .333  .333  1.49   .270  .333  .397  1.45
  2   .453  .327  .220  1.44   .327  .338  .334  1.41   .220  .334  .446  1.35
  3   .501  .319  .180  1.35   .319  .347  .334  1.36   .180  .334  .486  1.26
  4   .542  .310  .149  1.26   .310  .358  .332  1.32   .149  .332  .519  1.19
  5   .577  .300  .123  1.18   .300  .370  .330  1.28   .123  .330  .547  1.12
  6   .607  .291  .102  1.10   .291  .383  .326  1.25   .102  .326  .572  1.06
  7   .634  .281  .084  1.03   .281  .396  .322  1.22   .084  .322  .593  1.00
  8   .658  .272  .070  0.96   .272  .411  .318  1.19   .070  .318  .612  0.95
  9   .680  .262  .058  0.90   .262  .425  .312  1.17   .058  .312  .630  0.91
 10   .699  .253  .048  0.85   .253  .440  .307  1.14   .048  .307  .645  0.87

H1: 90% red, H2: 70% red, H3: 50% red; p0(Hi) = .333.
(Note: expected uncertainty, E(U), is not the same as the
uncertainty calculated from the expected posterior probabilities.)
TABLE 5. E[p_10(H_i|D) | H_j IS TRUE] FOR ALL COMBINATIONS OF
i AND j AND THE TWO INDICATED HYPOTHESIS SETS.

H1: 90% Red, H2: 70% Red, H3: 50% Red

                          j
                   1      2      3

         1       .699   .253   .048
    i    2       .253   .440   .307
         3       .048   .307   .645

    E(U)         0.85   1.14   0.87
    (in bits)

H1: 90% Red, H2: 60% Red, H3: 30% Red

                          j
                   1      2      3

         1       .824   .171   .005
    i    2       .171   .603   .226
         3       .005   .226   .769

    E(U)         0.48   0.86   0.55
    (in bits)

In both cases p_0(H_i) = .333.

TABLE 6. E[p_10(H_i|D) | H_j IS TRUE] FOR TWO FIVE-HYPOTHESIS SETS.

90, 80, 70, 60, 50% Red, respectively

                                 j
                   1      2      3      4      5

         1       .475   .282   .148   .068   .026
         2       .282   .271   .216   .146   .085
    i    3       .148   .216   .239   .221   .176
         4       .068   .146   .221   .273   .292
         5       .026   .085   .176   .292   .421

    E(U)         1.64   1.90   1.96   1.86   1.67
    (in bits)

90, 75, 60, 45, 30% Red, respectively

                                 j
                   1      2      3      4      5

         1       .604   .280   .094   .021   .002
         2       .280   .337   .240   .112   .031
    i    3       .094   .240   .300   .240   .126
         4       .021   .112   .240   .323   .304
         5       .002   .031   .126   .304   .537

    E(U)         1.19   1.63   1.75   1.63   1.30
    (in bits)

In both cases p_0(H_i) = .2.

For the hypothesis set represented by the bottom half of
table 6, it is true that the expectation is maximum when i = j,
which is to say that after ten observations the expected probability
of the true hypothesis is always larger than that of any of the false
ones. Note, however, that this property does not characterize the
values for the hypothesis set represented by the top half of the
table. In particular, given this hypothesis set, the expected
probability of H_1 is greater than that of H_2 after ten observations,
even if chips are drawn from an urn for which H_2 is true. Similarly,
E[p_10(H_5|D)] is greater than E[p_10(H_4|D)] when H_4 is true. With
continued sampling the expected probability of the true hypothesis
will continue to grow, finally approaching one, whereas that of
each of the false hypotheses will at some point begin to decrease
and will eventually approach zero. That the expected probability
of a false hypothesis can, at any point in the sampling, exceed
that of the true hypothesis may nevertheless be quite counterintuitive.
Figure 25 shows the way in which the expected values
of each of the posterior probabilities of the example represented
in the top half of table 6 change over twenty observations, given
that H_2 is really true. Note that p(H_2|D) is initially smaller
than p(H_1|D), but eventually overtakes and surpasses it; with
further sampling p(H_2|D) would continue to increase, whereas
p(H_1|D) would decrease.

A comparison of tables 5 and 6 illustrates several additional
points. The hypothesis sets represented in table 5 are contained
within those represented in table 6. Considering only those
hypotheses that are represented in both tables, it may be seen
that the expected posterior probabilities associated with hypotheses
within the smaller set are invariably larger than those associated
with the same hypotheses within the larger set. It may also be
seen that the expected amount of uncertainty remaining after ten
observations, given the truth of a specific hypothesis, is greater
when the hypothesis set contains five alternatives than when it
contains three. Of course, the a priori uncertainty is also
greater in the former case (2.32 bits versus 1.58 bits), so what
is of greater significance is the fact that a larger proportion of
the original uncertainty is resolved in the three-alternative case.
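
The entries of tables 4 through 6 can be recomputed by enumerating
the possible numbers of red chips in a sample. The following is a
minimal sketch in Python, assuming equal prior probabilities,
sampling with replacement (so that the number of red chips is
binomially distributed), and uncertainty measured as Shannon entropy
in bits. The function name `expected_posteriors` and its arguments
are illustrative and do not come from the original analysis.

```python
from math import comb, log2


def expected_posteriors(p_red, n_draws, true_index):
    """Expected posterior probabilities and expected uncertainty (bits)
    after n_draws chips, averaged over samples from the urn at true_index.

    p_red lists the proportion of red chips assumed under each hypothesis.
    Equal priors are an assumption of this sketch.
    """
    k = len(p_red)
    prior = 1.0 / k
    exp_post = [0.0] * k
    exp_unc = 0.0
    for reds in range(n_draws + 1):
        # Binomial probability of this many red chips under each hypothesis.
        like = [comb(n_draws, reds) * p ** reds * (1 - p) ** (n_draws - reds)
                for p in p_red]
        total = sum(l * prior for l in like)
        post = [l * prior / total for l in like]  # Bayes rule
        weight = like[true_index]  # chance of this sample if true_index holds
        for i in range(k):
            exp_post[i] += weight * post[i]
        exp_unc += weight * -sum(q * log2(q) for q in post if q > 0)
    return exp_post, exp_unc


# Three-hypothesis set of table 4, ten observations, H1 true.
post, unc = expected_posteriors([0.9, 0.7, 0.5], 10, 0)
print([round(p, 3) for p in post], round(unc, 2))

# Top half of table 6, ten observations, H2 true: compare H1 with H2.
post_h2, _ = expected_posteriors([0.9, 0.8, 0.7, 0.6, 0.5], 10, 1)
print(post_h2[0] > post_h2[1])
```

Under these assumptions the first call should come out close to the
H1 entries in the last row of table 4, and the second illustrates
the case in which a false hypothesis has the larger expected
probability after ten observations.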
8.6.10 Man as a Bayesian Hypothesis Evaluator

A considerable amount of experimentation has been done to
determine how well Bayes rule predicts behavior when an individual
attempts to process probabilistic information in situations
like those illustrated. For example, given the task of
deciding, on the basis of a sequence of observations, which of
several hypotheses about the nature of the source of those
observations is true, how closely will the estimates produced by the
human decision maker correspond to those produced by the application
of Bayes theorem? Obviously, in situations as highly structured as
those described, it would be of little interest to do such
experiments with an individual who understood Bayes rule and was permitted
to do the calculations necessary to use it. Such a test would do
nothing but demonstrate one's ability to do arithmetic.
Experiments on Bayesian information processing typically are done with
people who are not formally aware of Bayes rule or who, if they are,
are not given the time to perform the necessary calculations.
It is an interesting question, in this case, whether an
individual's intuitive, or at least informal, notions about evidence
will lead him to adjust his probability estimates in a way similar
to that which would result from an application of Bayes rule. And
if the answer to this question is no, it is of interest to determine
whether his performance deviates from that of Bayes rule in
consistent ways.

Perhaps the question that has been of greatest interest to,
and received most attention from, experimenters is whether hypotheses
are more effectively evaluated by having decision makers estimate
posterior probabilities, p(H|D), directly upon acquiring incoming
data, or by having them estimate conditional probabilities, p(D|H),
and then using these estimates to update the posterior probabilities
by means of Bayes rule. Much of the evidence favors the
conclusion that hypotheses are evaluated more efficiently when the
latter approach is taken, that is, when humans make estimates of
p(D|H) and these estimates are used along with Bayes rule to
calculate estimates of p(H|D). Although the directional effects of
data on posterior probability estimates produced by humans are
similar to those on estimates revised in accordance with Bayes rule,
the magnitudes of the effects tend to be smaller in the former case.
In particular, the posterior probabilities tend to attain more
extreme values and to reach asymptote faster when they are calculated
according to Bayes theorem than when they are estimated directly
by humans (Edwards, Lindman, & Phillips, 1965; Howell & Getty, 1968;
Kaplan & Newman, 1963; Peterson & DuCharme, 1967; Peterson & Miller,
1965; Peterson, Schneider, & Miller, 1965; Phillips & Edwards, 1966).
It appears, therefore, that humans tend to extract less information
from data than the data contain; they require more evidence than
does a Bayesian process to arrive at a given level of certainty
concerning which of the competing hypotheses is true. That is one
of the findings that has led to the characterization of man as a
"conservative" Bayesian. In other words, men tend to underestimate
high posterior probabilities and overestimate low ones. A similar,
but less pronounced, tendency is found when men estimate odds
rather than posterior probabilities (Phillips & Edwards, 1966).
Slovic and Lichtenstein (1971) refer to the conservatism of
man in his use of probabilistic data as the primary finding of
Bayesian research. They review three competing explanations of
the result: (1) misperception, or misunderstanding, by the subject
of the process by which the data are generated; (2) inability of
subjects to aggregate, or put together, the impacts of several
data to produce a single response; and (3) an inability, or
unwillingness, to assign extreme odds, e.g., odds outside the range
of 1:10 to 10:1. Whether any of these explanations is adequate
has yet to be determined.

It was the finding of conservatism that prompted Edwards (1963,
1965) and his colleagues (Edwards, Phillips, Hays, & Goodman, 1968)
to experiment with probabilistic information-processing systems
that use experts to judge the likelihoods of the data reaching the
system, given each hypothesis under consideration, and machines
to calculate posterior probabilities on the basis of these estimates
and the data.
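
The division of labor in such a system can be sketched as follows:
the human supplies, for each datum, an estimate of p(D|H) under
every hypothesis, and the machine applies Bayes rule. This is an
illustrative sketch only, not a description of the systems that were
actually built. The function names, and the assumption that
successive data are conditionally independent given each hypothesis,
are introduced here for the example.

```python
def update_posteriors(priors, likelihoods):
    """Apply Bayes rule once.

    priors      -- current p(H_i) for each hypothesis
    likelihoods -- judged p(D | H_i) for the datum just received
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    if total == 0:
        raise ValueError("datum judged impossible under every hypothesis")
    return [j / total for j in joint]


def aggregate_evidence(priors, likelihood_judgments):
    """Fold a sequence of likelihood judgments into posterior probabilities.

    Assumes the data are conditionally independent given each hypothesis,
    so that the posterior after one datum serves as the prior for the next.
    """
    posteriors = list(priors)
    for likelihoods in likelihood_judgments:
        posteriors = update_posteriors(posteriors, likelihoods)
    return posteriors


# Hypothetical judgments for a red chip and then a black chip, with
# urns of 90%, 70%, and 50% red and equal prior probabilities.
judgments = [[0.9, 0.7, 0.5], [0.1, 0.3, 0.5]]
print(aggregate_evidence([1 / 3, 1 / 3, 1 / 3], judgments))
```

The human contribution is confined to the `judgments` list; the
aggregation itself involves no judgment at all, which is the point
of the arrangement.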

Not all of the evidence that is relevant to the question favors
the conclusion that humans are invariably much better at estimating
p(D|H) than p(H|D). Southard, Schum, and Briggs (1964b), for
example, obtained some results that challenge the generality of the
finding that humans tend to underestimate high posterior
probabilities and overestimate low ones. In particular, given a small
hypothesis set and a frequentistic environment, the estimates of
p(H|D) produced by humans were close to, and sometimes more extreme
than, those produced by Bayesian methods. Other studies, several
from the same laboratory, have also yielded results that question
the validity of the general conclusion that better decisions result
when values of p(H|D) are derived by applying Bayes rule to men's
estimates of p(D|H) (Schum, Goldstein, & Southard, 1966; Howell, 1967;
Kaplan & Newman, 1966; Southard, Schum, & Briggs, 1964a). Often
even when evidence of conservatism has been found, the degree to
which the human's estimate of p(H|D) has differed from an estimate
produced by Bayes rule has been very slight (Peterson & Phillips,
1966; Schum, Southard, & Wombolt, 1969).

These findings do not permit one to conclude that estimates
of p(H|D) are never better when derived from estimates of p(D|H)
than when produced directly, but they do call into question the
opposite notion, namely that of the invariable superiority of the
indirect approach. Moreover, they suggest that the direction that
research should take is that of determining the conditions under
which each approach is warranted. Schum, Goldstein, and
Southard (1966) present some data, for example, that suggest that
estimates of p(H|D) that are produced directly are more adversely
affected by degradation in the fidelity of the incoming information
than are those that are derived from estimates of p(D|H).

Another finding that is relevant to the question of man's
capabilities as a Bayesian hypothesis evaluator is that evidence
that tends to confirm a favored hypothesis may be given more
credence than evidence that tends to disconfirm it (Brody, 1965;
Geller & Pitz, 1968; Pitz, Downing, & Reinhold, 1967; Slovic,
1966). This finding raises the more general question of whether
a vested interest in a decision outcome impairs one's ability to
evaluate data objectively. If it is the case, as Bacon (1955)
long ago suggested, that "what a man had rather were true, that
he more readily believes," at least one of Savage's basic
rules for the application of decision theory is generally violated.

The  possibility  that  an  individual's  preferences  among 
hypotheses  may  impair  his  ability  to  evaluate  them  in  an  unbiased 
way  is  closely  related  to  the  finding  that  people  tend  to  be 
reluctant  to  change  a decision  once  it  has  been  made  (see  Section 
4.3). 


Each of these tendencies (conservatism, partiality, and
perseverativeness) has been viewed as a fault, or as evidence
that man applies data to the evaluation of hypotheses in an
inefficient way. And, in the context of most laboratory
experiments in which it has been observed, it undoubtedly is. These
tendencies may sometimes be less patently unjustifiable outside
the laboratory, however. An insistence on having compelling
evidence before changing an established opinion may have a
stabilizing effect that is not altogether bad. Many opinions
are formed slowly over a period of years, and all the factors
that may have contributed to their formation cannot always be
recalled at will. The individual who is quick to change an
opinion every time he encounters an argument that he cannot
immediately refute may find himself constantly shifting from one
position to another, always a proponent of the view that he last
heard capably expounded.

Hypothesis  evaluation  has  been  studied  more  than  most  aspects 
of  decision  making  in  the  laboratory.  This  is  due  in  part  to  the 
existence  of  a simple  prescriptive  model  (Bayes  rule)  for  per- 
forming this  task,  given  an  appropriately  structured  problem,  and 
in  part  to  the  fact  that  it  lends  itself  to  laboratory  exploration 
more  readily  than  some  of  the  other  decision-making  functions. 

Much has been learned about man's capabilities and limitations
in applying evidence to the resolution of uncertainties about the
various aspects of a decision situation. Much remains to be
determined, however. Among several issues that deserve further
study are the following: the possibility that information-display
formats and response techniques may influence the subjective
probability estimates that are obtained (Damas, Goodman, & Peterson,
1972; Herman, Ornstein, & Bahrick, 1964); the apparent lack of
understanding of how to combine probabilities arising from
independent sources of information (Fleming, 1970); the possibility
that the weight that one attaches to data may depend on when those
data occur during the hypothesis-evaluation process (Chenzoff,
Crittendon, Flores, Frances, Mackworth, & Tolcott, 1960; Dale,
1966; Peterson & DuCharme, 1967); and the possibility that one's
ability to deal with uncertainty in a conflict situation may
depend on whether one is operating with an advantage or a
disadvantage with respect to one's opponent (Sidorsky & Simoneau, 1970).

8.6.11  Bayesian  Hypothesis  Evaluation  and  Training 

One way to interpret some of the results that have been
described above (for example, the finding that men often extract
less information from data than does a Bayesian aggregator) is
to see them as indications that man's intuitive notions concerning
the uses of evidence are not entirely consistent with the
implications of Bayes rule. Perhaps the thing to do, if this is the case,
is to disabuse would-be decision makers of those faulty intuitions.

Such a task might be approached in two ways. On the one
hand is the cognitive approach of teaching the decision maker about
Bayes rule and its implications. An alternative possibility is
to expose the decision maker to a variety of situations, in which
his behavior is evaluated and immediate feedback is provided to
him concerning the way in which it departs from optimality, if it
does. This is the behavior-shaping approach; in essence, it is
aimed at modifying one's intuitions without necessarily providing
an intellectual understanding of how optimality is defined. These
two approaches are not mutually exclusive, of course, and it seems
reasonable to assume that a training program would be more likely
to be effective if it used both. That is to say, the decision
maker should probably be given a good understanding of the notion
of inverse probability and how Bayes rule aggregates data; and he
should also be provided with considerable practice in attempting
to apply the rule in situations that are sufficiently
well-structured that his performance can be evaluated and compared to an
objective criterion of optimality. The selection of training
scenarios should put special emphasis on those situations for
which man's intuitions have been shown to be most misleading, e.g.,
especially small or especially large levels of a priori uncertainty
and situations in which the direction of evidence changes after
a tentative decision has been reached.
The results of some studies have indicated that such training
can be at least partially effective. Fleming (1970), for example,
explored the question of the effectiveness of feedback concerning
the outcome of a selected action in improving the decision maker's
performance on subsequent decision tasks. The context of the
study was a simulated tactical decision-making situation. Subjects
were required to combine probabilistic data from three independent
sources in order to arrive at an estimate of the relative
likelihood of attack on each of three ships. Initially, subjects
demonstrated an ignorance of the proper combining rule (multiplication)
and were conservative in estimating the overall probabilities of
attack. The investigator concluded that these data-aggregation
and probability-estimation tasks should be automated. He also
showed, however, that, although the subjects were unable to
generate the correct probabilities on the basis of feedback, they did
revise their estimates over the course of trials in such a way
as to correct for conservatism (apparently by adding a constant).
Other investigators have also shown that experience in estimating
posterior probabilities can produce behavior which, if not optimal,
is more nearly so than before the training began (Edwards, 1967;
Hoffman & Peterson, 1972; Southard, Schum, & Briggs, 1964b).
Such studies establish that certain aspects of hypothesis
evaluation, in particular posterior probability estimation, can be
improved somewhat as a result of practice. What they do not indicate,
however, is how much can be expected of training or how the
training should be done in order to obtain optimal results.
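
The multiplicative combining rule mentioned in connection with the
Fleming study can be illustrated with a short sketch. It assumes
that the sources are conditionally independent and that the
alternatives are equally likely a priori, in which case the combined
probabilities are proportional to the product of the per-source
probabilities. The function names and the numbers are hypothetical
and are not taken from the study; simple averaging is included only
as an example of a more conservative, non-Bayesian rule.

```python
def combine_independent(source_estimates):
    """Multiply the per-source probabilities for each alternative and
    renormalize. Assumes independent sources and equal priors."""
    products = [1.0] * len(source_estimates[0])
    for estimate in source_estimates:
        for i, p in enumerate(estimate):
            products[i] *= p
    total = sum(products)
    return [p / total for p in products]


def average_estimates(source_estimates):
    """Average the per-source probabilities (shown for contrast only)."""
    n = len(source_estimates)
    return [sum(column) / n for column in zip(*source_estimates)]


# Hypothetical probabilities of attack on three ships from three sources.
sources = [
    [0.6, 0.3, 0.1],
    [0.5, 0.3, 0.2],
    [0.7, 0.2, 0.1],
]
print(combine_independent(sources))  # product rule
print(average_estimates(sources))    # averaging, for comparison
```

With these numbers the product rule assigns the first ship a
probability above .9, whereas averaging leaves it at .6. An
estimator who averages, or who otherwise fails to multiply, will
therefore appear conservative in exactly the sense described above.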

Another issue that relates to training involves the question
of how well people can make the p(D|H) judgments that they are
required to make in some Bayesian systems. It seems to be generally
assumed that people have less trouble making these judgments than
they do making judgments of p(H|D). In at least one study,
however, this was not the case. Bowen, Feehrer, Nickerson, Spooner,
and Triggs (1971) encountered a fairly strong resistance on the
part of experienced military intelligence officers to the idea of
making judgments of the sort: "If it is assumed that 'Attack' is
the enemy commander's course of action, what is the probability
that one will observe the traditional indication 'Massing of Tanks'?"
These investigators pointed out that the "generally negative
reaction to the possibility of estimating probabilities of the type
that would be required in a Bayesian system must be tempered by
the fact that the participants were not familiar with the concept
of Bayesian inference and had not been trained to make the required
judgments" (p. 103). There is, therefore, the question of the
degree to which training in Bayesian analysis would be effective
in overcoming the relatively strong preferences that some decision
makers seem to have for estimating posterior probabilities
themselves.

Several other questions concerning man's capabilities as they
apply to hypothesis evaluation have been noted above. These
questions have arisen because of the results of experimental studies.
They are questions which, for the most part, have not yet been
adequately answered. The questions, in most cases, suggest some
limitation or deficiency in man's hypothesis-evaluation skills.
To the extent that these limitations or deficiencies are
demonstrated by further research to be genuine, they represent challenges
to designers of training programs. If it is the case, for example,
that probability estimations are sensitive to the format in which
information is displayed or the mode in which the response is given,
as some studies have indicated, the question is whether such effects
can be eliminated by training. If they cannot be, then the need
to be restrictive with respect to display formats and response
modes is so much the greater. Or, to take another example, if the
way one applies data to the evaluation of a hypothesis is different
for a favored hypothesis than for an unfavored one, as other studies
have suggested, this constitutes another challenge to training.
Can one be trained to apply data to all possible hypotheses in an
unbiased way without regard for his preferences? Similar questions
concerning the potential effectiveness of training can be raised
concerning each of the other limitations and deficiencies that
have been noted. More research will be required in order to answer
these questions.

8.7 The Measurement of Subjective Probability

Throughout this report we have made frequent reference to
subjective probabilities, and it has been tacitly assumed that
such things can be accurately measured. In fact, how to assure
accuracy in measurements of this quantity has been a question of
some interest. The difficulty arises because
the probabilities that one obtains may depend on the way in which
they are obtained; or as Toda (1963) puts it, subjective
probability is essentially defined by the measuring technique that is
used. Toda further suggests several criteria that such a
measuring technique should satisfy: "First, the logical nature of the
task presented to the subject should be thoroughly understood by
the experimenter, and, hopefully, by an intelligent subject.
Second, the task should involve well-defined payoffs to the subject.
Third, the task should be so structured that it is to the
disadvantage of a subject to respond in a manner inconsistent with his
expectations. Fourth, since our interest in measuring subjective
probability is related to its use in decision theory, the
measurement technique should not be inconsistent with decision theory" (p. 1).

The third of these criteria is perhaps the most subtle, and
has received the greatest amount of attention. Stated in other
terms, the requirement is that it be in the subject's best interest
to state his probability estimates honestly. That this can be a
problem may be illustrated by a simple example of a situation in
which the requirement is not met. Consider the case of a student
taking a multiple-choice examination. Suppose he has been
instructed that in answering each question he is to assign a number
to each of the alternatives associated with that question in such
a way as to reflect his estimation of the probability that that
alternative is the correct one. When he is very certain of which
alternative is correct, all of the numbers except one will be zero;
when he is less than 100% certain, however, he would assign
nonzero numbers to more than one alternative. Suppose further that
the score that he is to receive for any given question is some
linear function of the ratio of the number placed on the correct
alternative to the sum of the numbers used on all the alternatives
associated with that question. Given this scoring rule, the student
should not distribute numbers in accordance with his true
estimation of the probabilities; instead, he should put zeros on all
the alternatives except the one that he considers most likely,
even if he is not very certain that that alternative is indeed the
correct one.

This is easily seen by considering a two-alternative case.
Suppose that the student really thinks that the chances are 7 in
10 in favor of A being the correct alternative. If he is honest,
then he will assign 7/10 of whatever points he is going to use
to alternative A and 3/10 to B. Given our scoring rule, and
assuming that our hypothetical student assigns numbers to the two
alternatives in the ratio of 7 to 3, then the two values that his
score may assume are 7/10 and 3/10. Moreover, from the student's
point of view, the probability of getting a score of 7/10 is 7/10
(i.e., the probability that A is correct), and the probability of
getting a score of 3/10 is 3/10. Thus, the subjectively expected
value of his score is (7/10)^2 + (3/10)^2 = .58. But suppose that
our student were a gambler, and decided to put all his chances on
the alternative that he considered most likely to be correct. Now
the two values that his score can assume are 1 and 0, and the
expected value of his score (assuming that he really believes that
A's chances are 7 in 10, rather than 10 in 10, as his answer would
indicate) is 7/10 x 1 + 3/10 x 0 = .70. Thus, whereas the student
was instructed to assign numbers to alternatives in accordance with
his judgment of the likelihood of their being correct, the scoring
rule is such that he can expect to obtain a higher score by
ignoring the instructions than by following them.
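
The arithmetic of this example can be checked with a few lines of
Python. The sketch assumes the simplest linear rule, in which the
score is just the proportion of points placed on the correct
alternative; the function name is illustrative.

```python
def expected_linear_score(beliefs, weights):
    """Subjectively expected score when the score equals the share of
    points placed on the correct alternative.

    beliefs -- the student's true probabilities for each alternative
    weights -- the points he actually assigns to each alternative
    """
    total = sum(weights)
    return sum(p * w / total for p, w in zip(beliefs, weights))


beliefs = [0.7, 0.3]
print(expected_linear_score(beliefs, [7, 3]))  # honest report
print(expected_linear_score(beliefs, [1, 0]))  # all points on A
```

Because the expected score is linear in the reported proportions, it
is always maximized at an extreme report, so this rule rewards
overstating one's confidence whatever the true beliefs are.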

A scoring rule that is to satisfy Toda's "honesty is the best
policy" requirement must have what has been referred to as a
"matching property." In formal terms, the matching property may
be stated as follows: Suppose that a subject reports n non-negative
values, r_1, r_2, ... r_n, with r_1 + r_2 + ... + r_n > 0, presumably
to reflect the subjective probabilities that he associates with
alternative possibilities, x_1, x_2, ... x_n. Assume a discrete subjective
probability distribution, p_1, p_2, ... p_n, that represents the
subject's true probability estimates regarding x_1, x_2, ... x_n.
Letting P, R and X represent, respectively, the vectors
(p_1, p_2, ... p_n), (r_1, r_2, ... r_n) and (x_1, x_2, ... x_n),
and W(R, X) the payoff to the subject, given the response vector R
and the possibility vector X, the matching property is realized by
any payoff function for which the following statement is true:
The response vector, R, maximizes the subjectively expected payoff
E[W(R, X)], if and only if R = kP, k being a scalar constant.
That is to say, a payoff scheme, or a scoring rule, has the matching
property if and only if the subject maximizes his subjectively
expected payoff when the weights that he assigns to the possibilities
differ from his true subjective probabilities at most by the same
multiplicative factor. Note that when the relationship R = kP
does hold, the calculation of odds will be the same whether based
on R or on P.

Subjective-probability measurement procedures and response
scoring techniques that make use of functions that have this matching
property have been referred to as "admissible probability measures"
(Shuford, Albert, & Massengill, 1966), and "proper scoring rules"
(Winkler & Murphy, 1968). Several functions with the matching
property have been defined and investigated, among them the
"logarithmic loss" (Good, 1952; Toda, 1963), the "quadratic loss"
(Brier, 1950; deFinetti, 1962; Toda, 1963; van Naerssen, 1962),
and the "spherical gain" (Toda, 1963; Roby, 1964, 1965).

8.7.1  The  Logarithmic  Loss  Function 

The logarithmic loss function is unique among these functions
in its exclusive dependence on the value of the component of R
that is assigned to the correct alternative. It is not affected
by how numbers are distributed over the other components of R.
The function is given by

$$W_L(R \mid x_i) = k \log r_i - \sum_{j=1}^{n} r_j \tag{35}$$

where k is a positive constant, and (R|x_i) is read "response
vector R, given that x_i is the correct alternative." The
subjectively expected payoff, given this function, is

$$E(W_L) = k \sum_{i} p_i \log r_i - \sum_{j=1}^{n} r_j \tag{36}$$

which is maximized when r_i = p_i,

$$\max E(W_L) = k \sum_{i} p_i \log p_i - 1 \quad \text{(Toda, 1963)}. \tag{37}$$

Because the maximum subjectively expected value is negative (hence
its designation as a "loss" function), a constant is often added
to the function in order to make the payoff positive. Also, because
the function becomes negatively infinite at r_i = 0, a truncated
version of it is usually employed in practice.
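
A numerical check of the logarithmic rule is sketched below. It
assumes natural logarithms and k = 1, for which the expected payoff
peaks at r_i = p_i; for a general k the same calculus places the
maximum at r_i = k p_i, which still satisfies R = kP. The `floor`
argument is one hypothetical way of truncating the function near
zero and is not taken from the sources cited.

```python
from math import log


def expected_log_payoff(beliefs, response, k=1.0, floor=1e-6):
    """Subjectively expected payoff k * sum(p_i log r_i) - sum(r_j).

    Responses below `floor` are raised to `floor` inside the logarithm
    so that the payoff stays finite (an assumed form of truncation).
    """
    gain = sum(p * log(max(r, floor)) for p, r in zip(beliefs, response))
    return k * gain - sum(response)


beliefs = [0.7, 0.3]
for response in ([0.7, 0.3], [0.5, 0.5], [0.99, 0.01], [1.0, 0.0]):
    print(response, round(expected_log_payoff(beliefs, response), 4))
```

Of the four reports tried, the honest one yields the largest
expected payoff, and the all-or-none report, which was the best
strategy under the linear rule, is penalized most heavily.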

8.7.2  The  Quadratic  Loss  Function 


The  quadratic  loss  function  is  given  by 


    W_Q(R \mid x_i) = -\left[ \sum_{k \ne i} r_k^2 + (1 - r_i)^2 \right]        (38)

when the number of alternatives is greater than two, and by

    W_Q(R \mid x_i) = -r_k^2 , \quad k \ne i        (39)

for the two-alternative case. This function is negative in the
two-alternative case (although not necessarily when the number of
alternatives is greater than two), so, as in the case of the
logarithmic loss, a constant is often added to the function to
assure a positive payoff.
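
A short numerical check may make the matching property of the quadratic rule concrete. This sketch is not from the original report. It assumes that equation 38 is read as minus the squared distance between the response vector and the indicator vector of the correct alternative, and it searches a grid of two-alternative responses that sum to one. The names are illustrative.

```python
def quadratic_payoff(r, correct):
    """Minus the squared distance between r and the indicator vector of the
    correct alternative (an assumed reading of equation 38)."""
    return -sum((r_j - (1.0 if j == correct else 0.0)) ** 2
                for j, r_j in enumerate(r))

def expected_quadratic_payoff(p, r):
    """Subjectively expected quadratic payoff of responses r under beliefs p."""
    return sum(p_i * quadratic_payoff(r, i) for i, p_i in enumerate(p))

beliefs = [0.7, 0.3]
# Search responses (r, 1 - r) on a grid of hundredths.
grid = [i / 100 for i in range(101)]
best_r = max(grid, key=lambda r: expected_quadratic_payoff(beliefs, [r, 1 - r]))
print(best_r)
```

On this grid the expected payoff peaks where the response on the first alternative equals the belief of 0.7.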

8.7.3  The  Spherical-Gain  Function 

The  spherical-gain  function,  which  has  been  elaborated  by 
Roby  (1965)  will  be  considered  in  somewhat  more  detail,  because 
it  has  some  useful  properties  that  the  other  rules  do  not  have, 
and  a particularly  elegant  geometrical  representation  as  well. 

The  payoff  function  is  given  by 


    W_S(R \mid x_i) = \frac{r_i}{\sqrt{\sum_{j=1}^{n} r_j^2}}        (40)

For a proof that W_S is maximized only when R = kP see Shuford,
Albert, and Massengill (1966). A reference to the example that was
used  earlier  should  be  sufficient  to  make  the  assertion  plausible. 
Consider  again  the  two-alternative  examination  item  for  which  a 
student  thinks  the  chances  are  7 in  10  in  favor  of  alternative  A. 
Recall  that  if  his  score  is  a linear  function  of  the  proportion 
of  points  he  assigned  to  the  correct  alternative,  then  his  best 
strategy  is  to  put  zero  on  every  alternative  except  the  one  he 
considers  most  likely  to  be  correct;  in  which  case,  his  expected 
score  would  be  .70.  To  see  that  this  is  not  true  in  the  case  of 
the  spherical  gain  scoring  rule,  note  that  if  the  student  puts 
all  his  stakes,  say  n points,  on  alternative  A,  his  expected  score 
will be:

    (7/10) \cdot \frac{n}{\sqrt{n^2 + 0^2}} + (3/10) \cdot \frac{0}{\sqrt{n^2 + 0^2}} = .70 .

If, however, he weights the alternatives in accordance with his
judgment of what the chances really are, his expected score will
be:

    (7/10) \cdot \frac{7}{\sqrt{7^2 + 3^2}} + (3/10) \cdot \frac{3}{\sqrt{7^2 + 3^2}} = .76 .
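
The two expected scores above can be reproduced with a few lines of code. The sketch below is illustrative only. It assumes the spherical-gain payoff is the weight on the correct alternative divided by the Euclidean length of the weight vector, as the worked example implies, and the function names are not from the source.

```python
import math

def spherical_payoff(r, correct):
    """Weight on the correct alternative divided by the Euclidean length of r."""
    return r[correct] / math.sqrt(sum(r_j ** 2 for r_j in r))

def expected_spherical_payoff(p, r):
    """Subjectively expected spherical-gain payoff of weights r under beliefs p."""
    return sum(p_i * spherical_payoff(r, i) for i, p_i in enumerate(p))

beliefs = [0.7, 0.3]
all_on_a = expected_spherical_payoff(beliefs, [10, 0])    # every point on alternative A
matched = expected_spherical_payoff(beliefs, [7, 3])      # weights match beliefs
rescaled = expected_spherical_payoff(beliefs, [70, 30])   # same weights, larger scale
print(round(all_on_a, 2), round(matched, 2), round(rescaled, 2))  # 0.7 0.76 0.76
```

The third case illustrates that multiplying every weight by the same constant leaves the score unchanged.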


It  should  be  noted  that  the  procedure  permits  the  student  to 
assign  weights  to  the  various  alternatives  in  any  way  he  sees  fit. 
There  might  appear  to  be  some  advantage  in  forcing  the  numbers 
assigned  to  the  alternatives  for  a given  item  to  add  to  one, 
inasmuch  as  they  could  then  be  interpreted  directly  as  probability 
estimates. The student could be instructed to make his
assignments so that they would indeed add to one; however, this is an
unnecessary demand since the score is unaffected by a change of
scale. Moreover, if we wish to treat the assignments as
probability estimates, as we shall in what follows, we can easily
normalize  them  by  simply  dividing  each  assigned  number  by  the  sum 
of  the  numbers  associated  with  that  question.  When  this  is  done, 
and  each  of  the  original  numbers  is  replaced  with  the  resulting 
quotient,  then  each  of  the  resulting  numbers  will  be  referred  to 
as  a probability  estimate,  and  the  collection  of  numbers  associated 
with  a given  item  as  a probability  vector. 


A nice  feature  of  the  spherical  gain  scoring  rule  is  that  it 
provides  an  easy  and  intuitively  meaningful  way  of  distinguishing 
between  one's  confidence  in  the  truth  of  a particular  hypothesis 
(or correctness of a test item) and one's general degree of
"resolution" with respect to the overall decision space (or to the
whole test item). Roby defined, as a "resolution index,"

    RI = \sqrt{\sum_{j=1}^{n} r_j^2}        (41)


where  RI  represents  an  individual's  confidence  in  his  answer. 
Equation  41  is  simply  the  denominator  of  equation  40  after  the 
latter  has  been  normalized. 

As in the case of W_S, the maximum value of RI is 1. It should
be clear that RI = 1 only if r_j = 1 for one value of j and 0 for
all others. That is to say, in keeping with our intuitive notions
about how an index of confidence should behave, it assumes its
maximum value when one has put all his chances on a single
alternative. (Note that whether that alternative is correct or
incorrect is irrelevant to this measure, as it should be.) Unlike
W_S, RI cannot assume the value 0. Its minimum value depends on the
number of alternative hypotheses under consideration, or, in the
case of the examination example, the number of candidate answers
associated with a question. It is obtained when

    r_j = \frac{1}{n}

for all j; that is, the index gets its lowest value when the same
number is assigned to every alternative. Again, this is consistent
with our intuitive ideas about confidence. The fact that the
minimum value of the index depends on the number of alternatives is
also in keeping with our intuitions about how a measure of confidence
should behave: one should have less confidence in a guess among
three equally likely alternatives than in a guess between two of
them.
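
The behavior of the resolution index at its extremes can be illustrated as follows. This is a sketch under the assumption that RI is the Euclidean length of the normalized probability vector, which is how equation 41 is described in the text. With equal weights on n alternatives that length works out to 1/sqrt(n). The function name is illustrative.

```python
import math

def resolution_index(weights):
    """Euclidean length of the probability vector obtained by normalizing weights."""
    total = sum(weights)
    probs = [w / total for w in weights]
    return math.sqrt(sum(q ** 2 for q in probs))

print(resolution_index([5, 0, 0]))   # all weight on one alternative
print(resolution_index([7, 3]))      # unequal weights on two alternatives
print(resolution_index([1, 1]))      # two equally weighted alternatives
print(resolution_index([1, 1, 1]))   # three equally weighted alternatives
```

The index falls from 1 toward 1/sqrt(n) as the weights become more even, and the floor is lower for three alternatives than for two.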

8.7.4 Implementation of Admissible Probability Measures

One  of  the  practical  difficulties  in  applying  scoring  rules 
with  the  matching  property  is  that  of  providing  subjects  with 
intuitively  meaningful  information  concerning  the  implications  of 
their probability assignments vis-a-vis the scores that could
result from them. It is clear that simply providing individuals with
formal  expressions  of  the  rules  will  not  suffice,  at  least  for 
those  who  are  not  mathematically  trained.  One  approach  to  this 
problem  is  that  of  illustrating  the  implications  of  any  given  rule 
with  concrete  examples  that  make  clear  the  advantages  of  being 
honest. Another, and perhaps preferred, approach is that of
providing the individual with an explicit representation of the payoff
that he would receive, given the truth of any specific hypothesis
and the way in which he had distributed probabilities over the
alternatives.

Organist  and  Shuford  designed  a paper  and  pencil  procedure 
for  providing  this  information  in  the  case  of  the  logarithmic 
loss function (Baker, 1964; Organist, 1964; Organist & Shuford,
1964). Shuford (1967) and Baker (1968) have also described a
computer-based technique for providing similar information in a
dynamic way. In this case the alternatives open to the decision
maker are shown on a computer-driven display. Associated with
each alternative is a line, the length of which represents the
user's relative confidence that that alternative is the correct
one. The user adjusts the lengths of the lines by means of a
light pen. When the length of one line is changed by the user,
the lengths of all the others are adjusted by the computer so as
to constrain the lengths to sum to one at all times.
Also displayed with each line is a number which indicates to the
user what his payoff would be if the alternative associated with
that line were the correct one. The logarithmic loss function
determined the relationship between the numbers representing
potential payoffs and the lengths of the lines in the applications
of the system that are reported. However, the relationship could
as well have been determined by any other scoring rule of interest.
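
The logic of such a display can be sketched in a few lines. The code below is hypothetical and is not a description of the Shuford or Baker systems. It assumes that the remaining lines are rescaled proportionally when one line is changed (the source does not say how the adjustment was made), and it assumes a logarithmic payoff with k = 1 and an arbitrary floor of 0.01 as the truncation.

```python
import math

def set_length(lengths, index, new_value):
    """Set one line length and rescale the others so that the total stays 1.

    Proportional rescaling is an assumption. If all other lines are zero
    they are left at zero, and the total then equals new_value.
    """
    others = sum(v for j, v in enumerate(lengths) if j != index)
    scale = (1.0 - new_value) / others if others > 0 else 0.0
    return [new_value if j == index else v * scale
            for j, v in enumerate(lengths)]

def displayed_payoffs(lengths, k=1.0, floor=0.01):
    """Payoff shown beside each line: k*log(length) - 1, with lengths floored."""
    return [k * math.log(max(v, floor)) - 1.0 for v in lengths]

lengths = [0.25, 0.25, 0.25, 0.25]
lengths = set_length(lengths, 0, 0.7)   # the user lengthens the first line
payoffs = displayed_payoffs(lengths)
print(lengths)
print(payoffs)
```

Replacing `displayed_payoffs` with another scoring rule would change the numbers shown without changing the adjustment logic.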

8.7.5  The  Efficacy  of  Admissible  Probability  Measures 

It  would  seem  clear  from  the  mathematics  of  the  situation 
that  scoring  rules  that  have  the  matching  property  should  be  used 
in preference to those that do not. It has not been clearly
demonstrated empirically, however, that subjects tend to behave
dishonestly if such rules are not used, or that their responses
are free of biases if they are (Aczel & Pfanzagl, 1966; Jensen &
Peterson, 1973; Samet, 1971; Schum, Goldstein, Howell, & Southard,
1967). Moreover, it is also apparent that many of the probability
estimation situations of interest to investigators of decision
making are situations in which the only scoring rules that are
operative are those that are imposed by nature. The situations
in which subjective probabilities are of greatest practical
significance tend to be those in which the payoffs are beyond the
experimenter's control.

8.7.6  Subjective  Probability  Measurement  and  Training 

One question of interest that relates to training research
is whether individuals who have had experience in making probability
judgments  in  controlled  situations  with  scoring  rules  that  have 
the  matching  property  are  more  effective  at  judging  probabilities 
in  real-world  situations  than  those  who  have  had  experience  at 
estimating  probabilities  but  have  not  been  exposed  to  matching- 
property rules. As has already been noted, some investigators
have advocated the use of experts to estimate conditional
probabilities to be used in Bayesian aggregation systems (Bond & Rigney,
1966; Edwards, 1965b). Often, however, it is not possible to
determine how accurately such estimates are made. If one had an
objective indicant of the probabilities of interest that was
independent of the experts' judgments, it would not be necessary to
get  the  judgments.  It  would  be  of  interest,  however,  to  determine 
whether  the  behavior  of  experts  on  such  tasks  would  be  sensitive 
to  the  type  of  experience  they  had  had  in  estimating  probabilities 
in  controlled  situations,  and  in  particular  to  their  exposure  to 
admissible  or  inadmissible  probability  measurement  techniques. 

Savage (1971) has suggested that the early introduction of admissible
scoring rules to children, along with careful training in the
assessment of opinion strength, could have the salutary effect of
dispelling some of the myths concerning the relationships between
certainty, belief, and action that are fostered by conventional
educational testing methods, e.g., the idea that one should speak
and act as though certain, even when one is not, and the notion
that weakly held opinions are worthless.


NAVTRAEQUIPCEN  73-C-0128-1 


Scoring  rules  with  the  matching  property  answer  to  one  aspect 
of  the  problem  of  measuring  subjective  probabilities,  namely  that 
of  structuring  the  situation  so  that  honesty  in  reporting  is  the 
best  policy.  There  are  other  aspects  of  the  problem,  however,  that 
are  not  so  readily  solved.  Expressions  of  certitude  have  been  shown 
to  vary  considerably  as  a function  of  the  way  in  which  they  are 
reported  (Samet,  1971)  and  of  the  context  in  which  they  are  obtained 
(Nickerson  & McGoldrick,  1963).  Typically,  when  subjects  are  asked 
to  rate  their  confidence  in  their  own  performance  on  a perceptual 
or  cognitive  task,  a positive  correlation  between  these  variables 
is found (confidence is highest when performance is best), but the
strength of the relationship is not always great, and the
significance of a given confidence rating depends on the situation and the
person making it (Andrews & Ringel, 1964; Nickerson & McGoldrick,
1965). A fundamental question that is raised by these results is
whether such factors affect certitude itself, or only its expression.
A further question is whether such variability, whatever its basis,
can  be  eliminated,  or  at  least  significantly  reduced,  as  a result 
of  appropriate  training. 

8.8 The Use of Unreliable Data

In  the  foregoing  discussions  of  the  use  of  Bayes  rule,  it  has 
been  tacitly  assumed  that  the  data  used  in  estimating  conditional 
or posterior probabilities had been accurately observed and
reported. In the chips-in-urn illustrations, for example, it was
assumed  that  one  could  examine  a chip  and  determine  its  color  easily, 
or  that  someone  else  determined  the  color  and  reported  it  accurately. 
Thus,  the  decision  maker  could  operate  with  complete  confidence  in 
the  data  at  his  disposal.  In  the  real  world  of  decision  making, 
things often are not this way. Frequently, the observation or the
reporting  of  events  is  faulty,  and  the  decision  maker  is  obliged 
to  take  this  fact  into  account  when  making  use  of  the  data  that  he 
has  obtained. 

We  naturally  assume  that  data  from  a trustworthy  source  will 
be  more  useful  to  a decision  maker  than  will  data  from  a source 
that  has  not  inspired  confidence  in  the  past.  The  use  of  an 
explicit  reliability  rating  procedure  for  intelligence  reports  by 
NATO  army  forces  (see  Section  5.2)  is  based  on  such  an  assumption. 
Few  attempts  have  been  made,  however,  either  to  validate  this 
assumption or to determine in a quantitative way exactly how
confidence in a data source does affect the way in which the data
from  that  source  are  applied  to  a decision  problem. 

8.8.1  Prescriptive  Approaches 

One  class  of  prescriptive  models  for  taking  into  account  the 
reliability of data has come to be known as "cascaded" or
"multistage" inference, suggesting a process of hypothesis evaluation
that involves more than one step. Schum and DuCharme (1971) point
out that research on cascaded inference has been focused on two
situations: one, in which the observer or reporter of an event
expresses his degree of certainty concerning whether or not the event
actually occurred (see, for example, Dodson, 1961; Gettys & Willke,
1969; Steiger & Gettys, 1972), and a second, in which the report of
an event is made without qualification by a source that is known to
be less than perfectly reliable (see, for example, Schum & DuCharme,
1971; Schum, DuCharme, & DePitts, 1973; Snapper & Fryback, 1971).

In  both  cases,  attention  has  been  confined  primarily  to 
relatively simple situations, e.g., those in which (1) the decision
maker's task is to determine which of two hypotheses, H_1 or H_2, is
true, and observations have only two possible outcomes, D_1 and D_2,
and (2) the reliability of a report is independent of the
hypotheses that are being considered, that is to say, events D_1
and D_2 are neither more nor less likely to be confused under
H_1 than under H_2.


Dodson  (1961)  considered  the  situation  in  which  an  observer 
is not certain which of two mutually exclusive events, D_1 and D_2,
has  occurred,  but  may  be  able  to  make  a probability  or  certitude 
judgment  on  the  question.  He  suggested  that  in  order  to  calculate 
the  posterior  probability  of  a hypothesis  in  this  case,  one  should 
calculate  its  value,  given  each  of  the  possible  events,  and  then 
take  a weighted  sum  of  these  values,  the  weights  being  the  proba- 
bilities that  the  observer  attaches  to  the  event  possibilities. 
Given  only  two  possible  data,  the  calculation  may  be  represented 
as  follows: 


    \psi(H_i \mid D) = \psi(D_1)\, p(H_i \mid D_1) + \psi(D_2)\, p(H_i \mid D_2)        (42)

where ψ(H_i|D) is the posterior probability of H_i, taking the
observer's uncertainty into account, and ψ(D_j) is the probability
that the observer attaches to the possibility that he has observed
event D_j. More generally, given n possible events and the
assumption that the observer can attach a probability to each of them,
the formula might be written as

    \psi(H_i \mid D) = \sum_{j=1}^{n} \psi(D_j)\, p(H_i \mid D_j) .        (43)

Substituting the Bayesian formula for p(H_i|D_j) we have

    \psi(H_i \mid D) = \sum_{j} \psi(D_j)\, \frac{p(D_j \mid H_i)\, p(H_i)}{\sum_{k} p(D_j \mid H_k)\, p(H_k)}        (44)

Using  Dodson's  work  as  a point  of  departure,  Gettys  and  Willke 
(1969)  and  Schum  and  DuCharme  (1971)  gave  the  process  of  dealing 
with unreliable data a more explicit two-stage form. The following
discussion roughly follows Schum and DuCharme. These writers
focused on the case in which a decision maker obtains information
about data via a source that sometimes incorrectly reports what
has actually occurred. (It is irrelevant to this discussion whether
the source's errors are assumed to be errors of observation or
errors of report.) For each of the decision problems that were
analyzed there were two hypotheses, H_1 and H_2, two possible data
events, D_1 and D_2, and two possible reports by the source, d_1 and d_2.
What one wants to determine is p(H_i|d_j).*

According  to  Bayes  rule 


    p(H_i \mid d_j) = \frac{p(d_j \mid H_i)\, p(H_i)}{p(d_j)}        (45)


The problem then is to determine p(d_j|H_i). If the probability of
a datum conditional on a hypothesis, p(D_k|H_i), and the probability
of a report, conditional jointly on a hypothesis and a datum,
p(d_j|H_i ∩ D_k), are known, then the probability of a report
conditional on a hypothesis, p(d_j|H_i), can be easily calculated. The
relationship is given by

    p(d_j \mid H_i) = \sum_{k} p(D_k \mid H_i)\, p(d_j \mid H_i \cap D_k) ,        (46)

a graphical representation of which is shown in figure 26. When,
by assumption, the reliability of a report is independent of the
hypothesis that is being considered,

    p(d_j \mid H_i \cap D_k) = p(d_j \mid D_k)        (47)

so, in effect,

    p(d_j \mid H_i) = \sum_{k} p(D_k \mid H_i)\, p(d_j \mid D_k) .        (48)


*Our notation differs slightly from that used by Schum and DuCharme:
we use D_1 and D_2 to represent the two possible data events, whereas
they used D and \bar{D}, and we use d_1 and d_2 to represent reported data
whereas they used D* and \bar{D}*.

Schum and DuCharme refer to

    \Lambda = \frac{p(d_j \mid H_1)}{p(d_j \mid H_2)}        (49)

as the "adjusted likelihood ratio," the likelihood ratio that takes
into account the degree of reliability of the source. Of course,
Λ reduces to the standard likelihood ratio when the source is
assumed to give completely reliable reports, inasmuch as, in this
case,

    p(d_j \mid D_k) = \begin{cases} 1 & \text{for } j = k \\ 0 & \text{for } j \ne k . \end{cases}

The way to deal with the problem of unreliable data then,
according to Schum and DuCharme, is with a two-step process: (1) adjust
the diagnosticity of the data by determining p(d_j|H_i) or Λ, and
(2) apply the adjusted data to revise the distribution of
probabilities over the hypotheses via Bayes rule.
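
The two-step process can be sketched as follows. The example is illustrative and assumes, as in equation 48, that report reliability is independent of the hypothesis. The probabilities are invented, the function names are not from the source, and the second step uses Bayes rule in its odds form.

```python
def report_likelihood(p_data_given_h, p_report_given_data, report):
    """Equation 48: p(d_j | H_i) = sum_k p(D_k | H_i) * p(d_j | D_k).

    p_report_given_data[k][j] holds p(d_j | D_k).
    """
    return sum(p_dk * p_report_given_data[k][report]
               for k, p_dk in enumerate(p_data_given_h))

def adjusted_likelihood_ratio(p_d_h1, p_d_h2, p_report_given_data, report):
    """Step 1: likelihood ratio of the report rather than of the datum."""
    return (report_likelihood(p_d_h1, p_report_given_data, report)
            / report_likelihood(p_d_h2, p_report_given_data, report))

def revise_odds(prior_odds, likelihood_ratio):
    """Step 2: Bayes rule in odds form."""
    return prior_odds * likelihood_ratio

p_d_h1 = [0.8, 0.2]                     # p(D_1 | H_1), p(D_2 | H_1)
p_d_h2 = [0.3, 0.7]                     # p(D_1 | H_2), p(D_2 | H_2)
reliable = [[1.0, 0.0], [0.0, 1.0]]     # a source that never errs
noisy = [[0.9, 0.1], [0.1, 0.9]]        # a source that is right 90% of the time

ratio_reliable = adjusted_likelihood_ratio(p_d_h1, p_d_h2, reliable, 0)
ratio_noisy = adjusted_likelihood_ratio(p_d_h1, p_d_h2, noisy, 0)
print(ratio_reliable, ratio_noisy, revise_odds(1.0, ratio_noisy))
```

With the perfectly reliable source the adjusted ratio equals the standard likelihood ratio. With the noisy source it moves toward 1.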


What one must be able to measure or estimate in order to
use this procedure are p(D|H), the standard conditional probabilities
of Bayes theorem, and p(d|D), the indices of source reliability.

Schum and DuCharme define source reliability in terms of

    r = p(d_i \mid D_i) ,

the probability that the source will report a data event accurately.
They distinguish four different decision "cases" in terms of
certain symmetries and asymmetries involving p(D|H) and p(d|D), and
spell out their prescription for dealing with unreliability for each
case. The cases that they distinguish are:


Case I: Symmetric p(D|H); Symmetric p(d|D)

    p(D_1 \mid H_1) = p(D_2 \mid H_2) ; \quad p(d_1 \mid D_1) = p(d_2 \mid D_2) .

Case II: Asymmetric p(D|H); Symmetric p(d|D)

    p(D_1 \mid H_1) \ne p(D_2 \mid H_2) ; \quad p(d_1 \mid D_1) = p(d_2 \mid D_2) .

Case III: Symmetric p(D|H); Asymmetric p(d|D)

    p(D_1 \mid H_1) = p(D_2 \mid H_2) ; \quad p(d_1 \mid D_1) \ne p(d_2 \mid D_2) .

Case IV: Asymmetric p(D|H); Asymmetric p(d|D)

    p(D_1 \mid H_1) \ne p(D_2 \mid H_2) ; \quad p(d_1 \mid D_1) \ne p(d_2 \mid D_2) .


In order to avoid the use of conditional probability
notation, Schum and DuCharme introduced the following notational
equivalencies:

For symmetric p(D|H):

    p = p(D_i \mid H_i)
    1 - p = p(D_j \mid H_i) , \quad j \ne i

For asymmetric p(D|H):

    p_1 = p(D_1 \mid H_1)
    p_2 = p(D_1 \mid H_2)
    1 - p_1 = p(D_2 \mid H_1)
    1 - p_2 = p(D_2 \mid H_2) .

For symmetric p(d|D):

    r = p(d_i \mid D_i)
    1 - r = p(d_j \mid D_i) , \quad j \ne i

For asymmetric p(d|D):

    r_1 = p(d_1 \mid D_1)
    r_2 = p(d_2 \mid D_2)
    1 - r_1 = p(d_2 \mid D_1)
    1 - r_2 = p(d_1 \mid D_2) .

Letting the subscripts on Λ represent symmetry or asymmetry
with respect to p(D|H) and p(d|D), respectively, and making the
above substitutions into equation (46), as appropriate, we obtain
Schum and DuCharme's expressions for the prescribed use of data
of imperfect, but known, reliability for each of the four cases
they considered. All adjusted likelihood ratios represent

    \frac{p(d_1 \mid H_1)}{p(d_1 \mid H_2)} .

Case I:

    \Lambda_{s,s} = \frac{pr + (1-p)(1-r)}{(1-p)r + p(1-r)}        (50)

Case II:

    \Lambda_{a,s} = \frac{p_1 r + (1-p_1)(1-r)}{p_2 r + (1-p_2)(1-r)}        (51)

or, equivalently,

    \Lambda_{a,s} = \frac{p_1 + k}{p_2 + k}        (52)

where

    k = \frac{1-r}{2r-1} , \quad r \ne .5        (53)

Case III:

    \Lambda_{s,a} = \frac{p r_1 + (1-p)(1-r_2)}{(1-p) r_1 + p(1-r_2)}        (54)

or, if p \ne 1 and r_2 \ne 1,

    \Lambda_{s,a} = \frac{c \left[ \frac{p}{1-p} \right] + 1}{c + \frac{p}{1-p}}        (55)

where

    c = \frac{r_1}{1-r_2}        (56)

Case IV:

    \Lambda_{a,a} = \frac{p_1 r_1 + (1-p_1)(1-r_2)}{p_2 r_1 + (1-p_2)(1-r_2)}        (57)

or, if r_1 \ne (1-r_2),

    \Lambda_{a,a} = \frac{p_1 + b}{p_2 + b}        (58)

where

    b = \frac{1-r_2}{r_1 + r_2 - 1}        (59)

It follows from the definitions of unadjusted and adjusted
likelihood ratio that the latter is always closer to unity than
the former and that the difference between them increases as
reliability, r, is decreased from 1.0 to 0.5* (except when the
data are completely uninformative to begin with and the unadjusted
ratio is 1). This is consistent with the intuitively compelling
requirement that the less reliable the data, the less diagnostic
impact it should have.** Figure 27 shows how the difference
between unadjusted and adjusted likelihood ratio grows as reliability
is decreased, and how the adjusted ratio goes to 1 for r = .5, for
the case in which both p(D|H) and r are symmetric, i.e., Schum and
DuCharme's Case I.

Figure 27 also illustrates the fact that the greater the
diagnostic impact of data (when reported by a completely reliable
source), the greater is the effect of a decrease in reliability of
a report. This also is an intuitively reasonable relationship:
the less informative data are to begin with, the less there is to
lose if they are reported unreliably. What is less intuitively
apparent is the fact that even a very small decrease in reliability
may have an extremely large effect on likelihood ratio if the
unadjusted ratio is very high. Schum and DuCharme (1971) point out,
for example, that in Case I, if a datum with an unadjusted
likelihood ratio of 100,000 is reported by a source with a reliability
of .99, the adjusted ratio is reduced by about three orders of
magnitude to slightly less than 99.
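
The Case I example can be evaluated directly. The sketch below is illustrative. It uses the Case I expression (equation 50) and recovers p from the unadjusted ratio through L = p / (1 - p), which holds when p(D|H) is symmetric. The function name is not from the source.

```python
def case_one_adjusted_ratio(unadjusted_ratio, r):
    """Equation 50, with p recovered from L = p / (1 - p)."""
    p = unadjusted_ratio / (1.0 + unadjusted_ratio)
    return (p * r + (1 - p) * (1 - r)) / ((1 - p) * r + p * (1 - r))

# Adjusted ratio for an unadjusted ratio of 100,000 at several reliabilities.
for r in (1.0, 0.99, 0.9, 0.7, 0.5):
    print(r, case_one_adjusted_ratio(100_000, r))
```

For an unadjusted ratio of 100,000 and r = .99 the adjusted ratio comes to roughly 98.9, and it falls to 1 at r = .5.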

*Decreasing r below 0.5 has the effect of making the adjusted
likelihood ratio depart again from unity, although it still remains
closer to unity than does the unadjusted ratio. In other words,
decreasing the reliability quotient below 0.5 increases the
diagnosticity of the data, but in support of the alternative hypothesis.
This is consistent with the idea that a source that is consistently
wrong may be very informative; one need only interpret its report
as evidence of the opposite of what it says. In this discussion,
we will confine our attention to the case in which 1.0 > r > 0.5.

**Schum and DuCharme (1971) point out, however, that when the
reliability of report is not independent of which hypothesis is
being considered, it is possible for Λ to differ more from 1 than
does L; that is to say, it is possible for a decrease in
reliability, in that case, to increase the diagnosticity of data.

Figure 27. Adjusted likelihood ratio (Λ) as a function
of data reliability (r) for several values
of unadjusted likelihood ratio (L), for Schum
and DuCharme's Case I

The results of Schum and DuCharme's analysis bear on issues
relating to the design of information and decision-making systems
and on the role of humans therein. For example, they show that under
Case I conditions, there is a reasonably straightforward tradeoff
between p(D|H) and p(d|D). And the tradeoff is such that if one
wants to increase the diagnostic impact of information flowing
through a system, and the costs of increasing the conditionals
p(D|H) and p(d|D) are equal, one should increase the smaller of
the two.
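
The Case I tradeoff can be seen in a small numerical comparison. This sketch is illustrative. It uses the Case I expression (equation 50), in which p and r enter symmetrically, and the starting values and step size are arbitrary.

```python
def case_one_ratio(p, r):
    """Equation 50: adjusted likelihood ratio for symmetric p(D|H) = p and p(d|D) = r."""
    return (p * r + (1 - p) * (1 - r)) / ((1 - p) * r + p * (1 - r))

p, r, step = 0.9, 0.7, 0.05
baseline = case_one_ratio(p, r)
raise_larger = case_one_ratio(p + step, r)    # improve the larger conditional
raise_smaller = case_one_ratio(p, r + step)   # improve the smaller conditional
swapped = case_one_ratio(r, p)                # exchange the roles of p and r
print(baseline, raise_larger, raise_smaller, swapped)
```

With these values the same increment produces a larger adjusted ratio when it is applied to the smaller conditional, and exchanging p and r leaves the ratio unchanged.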

Also, the analyses show that in Cases II and IV Λ is dependent
upon specific values of p_1 and p_2 rather than on their ratio. Thus,
despite the fact that earlier results have suggested that people
find it easier to make judgments of likelihood ratios than of
conditional probabilities, there may be situations in which
estimates of the latter should be required.
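
A pair of invented values shows why the ratio alone is not enough in Case II. The sketch below is illustrative. It assumes p_1 and p_2 denote p(D_1|H_1) and p(D_1|H_2), so that p_1 / p_2 is the unadjusted likelihood ratio for D_1, and it evaluates the Case II expression (equation 51).

```python
def case_two_ratio(p1, p2, r):
    """Equation 51: adjusted ratio for asymmetric p(D|H) and symmetric reliability r."""
    return (p1 * r + (1 - p1) * (1 - r)) / (p2 * r + (1 - p2) * (1 - r))

r = 0.8
high_pair = case_two_ratio(0.8, 0.4, r)   # unadjusted ratio p1 / p2 = 2
low_pair = case_two_ratio(0.2, 0.1, r)    # unadjusted ratio p1 / p2 = 2
print(high_pair, low_pair)
```

Both pairs have an unadjusted ratio of 2, yet the adjusted ratios differ, so an estimate of the ratio by itself would not determine the adjustment.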

8.8.2  Some  Empirical  Results 

The models developed by Schum and DuCharme are prescriptive,
providing for optimal adjustment of the likelihood ratio under
conditions in which data are reported with less than total, but
known, reliability. We now turn to a consideration of several
studies aimed at comparing actual performance against that prescribed
by these models. In the next section we then present a brief
account of some descriptive models suggested by these results.
All experiments and models that will be considered in these sections
address situations where input to the decision process is an event
or set of events reported by a single unreliable source.

Snapper and Fryback (1971) present the results of a study in
which the experimenter reported to the subject with (symmetric)
reliabilities of 1.0, 0.9, and 0.7 the outcomes of events
conceptually similar to the draws of chips from an urn. The probabilities
of events conditional on hypotheses, p(D_1|H_1), p(D_2|H_1) and
p(D_1|H_2), p(D_2|H_2), were, respectively, as follows: (a) 0.33, 0.67
and 0.67, 0.33; (b) 0.80, 0.20 and 0.60, 0.40; (c) 0.90, 0.10 and
0.45, 0.55; (d) 0.25, 0.75 and 0.75, 0.25. For conditions in
which the experimenter's reliability was equal to unity, only (a)
and (b) were used. Subjects were required to indicate which of
the hypotheses they considered more likely as a result of the
experimenter's report, and how much more likely than the alternative
hypothesis they considered it to be. Under conditions of unit
reliability, subjects' estimates corresponded very closely to the
actual likelihood ratio, but when reliability was less than unity
they represented slight underestimates of the impact of the least
diagnostic reports and overestimates of the impact of the remaining
reports. The extent of this overestimation, moreover, increased
with the magnitude of Λ.

Johnson (1974; see also Johnson, Cavanagh, Spooner, & Samet,
1973) has utilized a similar task and response structure to
study the effects of four different variables on cascaded
inference: (1) sample size, the number of draws which underlay
a cumulative outcome report (e.g., "three reds, two blacks");
(2) data generator diagnosticity, the relative composition of red
and black chips in the urn; (3) sample diagnosticity, the
diagnostic value defined by the difference between total numbers of red
and of black chips underlying a report; and (4) source reliability.
Posterior-odds estimates that were obtained in this case were found
to be sensitive to different values of sample size, data generator
diagnosticity and source reliability, tending to decrease as the
values of these variables decreased. When the report was known
to be perfectly reliable, estimates of posterior odds were generally
more conservative than those computed from Bayes theorem; however,
they became progressively less conservative and approached optimal
values at intermediate levels of reliability (.8-.7), and then
became slightly excessive at lower levels (.7-.6).

The  diagnosticity  and  reliability  of  reported  events  were 
manipulated by Youssef and Peterson (1973) in such a way that the
value of Λ in a situation requiring multistage inference was
equal to the value of the standard likelihood ratio in a single-
stage situation (that is, one with report reliability equal to
unity). Subjects' estimates tended to be conservative for high
values, both of Λ and of L, as compared with the Bayesian model,
and  tended  to  be  excessive  for  low  values.  The  odds  estimated 
in  conditions  requiring  multistage  inference  were  consistently 
greater  than  those  estimated  in  single-stage  conditions  and,  as 
a result,  were  excessive  compared  to  the  optimal  odds  over  a 
wider  range  than  were  single-stage  odds. 

Schum,  DuCharme,  and  DePitts  (1971)  conducted  a study  in  which 
the  accuracy  of  subjects'  own  observations  of  the  number  of  Xs 
contained  in  tachistoscopically  presented  4x4  matrices  of  Xs 
and  Os  constituted  the  reliability  levels.  Subjects  were  required 
to  estimate  the  relative  likelihood  of  two  possible  hypotheses 
relating  to  the  data  generator  after  each  of  five  stimulus 
presentations.  Under  conditions  in  which  sufficient  time  was 
available for totally accurate observation of the stimuli,
estimates became increasingly conservative compared to the optimal
model as the diagnosticity of each observed event and the
inferential consistency over a set of five events increased. Under
conditions in which insufficient time was available for accurate
observations, the subjects' estimates were generally close to
optimal or slightly excessive when diagnosticity and consistency
were high, and became more conservative as either of these
parameters assumed lower values. In a second phase of this same study,
subjects estimated directly the diagnosticity of data based on
brief observations of each slide. Compared with the optimal model,
such estimates became increasingly excessive as L increased.

The results of these studies establish that the behavior of
decision makers is indeed influenced by the degree of reliability
of their data sources. They also demonstrate, however, that
performance tends not to be consistent with that prescribed by
the formally appropriate rule for adjusting data diagnosticity.
Further, performance with unreliable data often differs in one
important respect from that observed in classical
Bayesian inference situations in which events are observed, or
reported, with perfect accuracy. Whereas in the latter case the
decision maker's estimates, though revised in the appropriate
direction, tend to be conservative as compared with Bayes theorem,
his estimates based on less-than-completely-reliable data
frequently appear to be excessive as compared with Schum and
DuCharme's prescription for optimality. Because the value of $\Lambda$ as
defined by the Schum and DuCharme model, in effect, makes an
adjustment in the direction of increasing conservatism (produces a value
closer to unity), the two effects (conservatism vis-a-vis L, and
excessiveness vis-a-vis $\Lambda$) can offset each other, if conditions
are just right.

8.8.3  Some  Attempts  to  Develop  Descriptive  Models  of 
Cascaded  Inference 


As we have noted, the model developed by Schum and DuCharme
(1971) for dealing with unreliable data prescribes two steps, or
stages: in the first stage, the nominal diagnosticity of a datum
is discounted to reflect the degree of reliability of the source,
and in the second, the adjusted datum is applied to the hypotheses
under evaluation in accordance with Bayes rule. If hypotheses
are being evaluated in terms of odds, the process can be
represented as follows:

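Stage 1: compute $\Lambda$, the likelihood ratio based on $P(d \mid H_i)$,
discounted for the reliability of the source

Stage 2: compute $\Omega_1 = \Lambda \Omega_0$

where $\Lambda$ represents the adjusted likelihood ratio, and $\Omega_1$ and $\Omega_0$
represent the posterior and prior odds, respectively.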
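
The two prescribed stages can be sketched in code. The following is a minimal illustration under stated assumptions rather than the model as Schum and DuCharme wrote it: it assumes two hypotheses, a binary datum, and a source that reports the event that actually occurred with probability r and the opposite event with probability 1 - r (symmetric reliability). The function names are hypothetical.

```python
def adjusted_likelihood_ratio(p_d_h1, p_d_h2, r):
    """Stage 1: likelihood ratio of a *report* of datum d.

    Assumption (for illustration only): the datum is binary and the
    source reports the true event with probability r, the opposite
    event with probability 1 - r.
    """
    num = r * p_d_h1 + (1.0 - r) * (1.0 - p_d_h1)
    den = r * p_d_h2 + (1.0 - r) * (1.0 - p_d_h2)
    return num / den


def posterior_odds(prior_odds, ratio):
    """Stage 2: Bayes rule in odds form."""
    return ratio * prior_odds


if __name__ == "__main__":
    # Nominal likelihood ratio is 0.7 / 0.3; it shrinks toward 1 as r falls.
    for r in (1.0, 0.8, 0.5):
        lam = adjusted_likelihood_ratio(0.7, 0.3, r)
        print(r, round(lam, 3), round(posterior_odds(1.0, lam), 3))
```

Under these assumptions a perfectly reliable source (r = 1) leaves the nominal ratio unchanged, and a source with r = .5 yields a ratio of 1, so that its reports leave the odds untouched.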

The  experimental  results  that  were  reviewed  briefly  above 
make  it  clear  that  people  typically  do  not  behave  in  accordance 
with  this  prescription.  Several  investigators  have  attempted  to 
develop  models  that  do  describe  behavior. 

The results obtained by Snapper and Fryback (1971), using
symmetric reliabilities, suggest that in dealing with unreliable
data, decision makers estimate the likelihood ratio as though the
data were completely reliable, adjust the resulting ratio by
multiplying it by the reliability quotient, and then apply the
adjusted ratio to the calculation of posterior odds. The process
may be represented as follows:




Stage 1: compute $\Lambda' = rL$
Stage 2: compute $\Omega_1 = \Lambda' \Omega_0$.

Snapper and Fryback note that the optimal rule for the first stage
of the process is neither apparent nor intuitive, whereas the rule
that seemed to describe the behavior of their subjects has some
intuitive appeal and is easily applied. Its use leads, however, to
subjective estimates of likelihood ratio that are excessive in
comparison with those prescribed by $\Lambda$. That is to say, $\Lambda'$, the
subjects' adjusted ratio rL, leads to overestimation of the
diagnostic impact of a given (unreliable) datum.

The extent to which $\Lambda' = rL$ overestimates $\Lambda$, for Schum and
DuCharme's Case I, is illustrated in figures 28 and 29. Figure 28 shows
both $\Lambda$ and $\Lambda'$ as functions of r for several values of L; figure
29 shows the ratio $\Lambda'/\Lambda$ for the same conditions. The figures show
only cases for which $L \geq 1$ and $r \geq .5$. For L < 1, one obtains the
same relationships by simply expressing the likelihood ratio as that
of $H_2$ relative to $H_1$ rather than $H_1$ relative to $H_2$. The case of
r < .5 is of little
interest for the reason explained in the first footnote on page 134.
As may be seen from these figures, the degree to which $\Lambda'$
overestimates $\Lambda$ depends both on L and r: for given L it tends to
vary inversely with r (given $r \geq .5$) and for given r it increases
sharply with L.
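
The pattern described for the two figures can be reproduced numerically. The sketch below is illustrative only: it assumes the symmetric case (p(d|H1) = p, p(d|H2) = 1 - p, so that L = p/(1 - p)) and a source that reports the true event with probability r, and it compares the resulting optimal adjusted ratio with the simpler product rL. The function name is hypothetical, and the numbers are not taken from the figures.

```python
def case1_comparison(L, r):
    """Return (optimal adjusted ratio, heuristic ratio r*L, their quotient).

    Assumes symmetric p(D|H) and symmetric source reliability r.
    """
    p = L / (1.0 + L)  # p(d|H1) = p, p(d|H2) = 1 - p
    optimal = (r * p + (1.0 - r) * (1.0 - p)) / (r * (1.0 - p) + (1.0 - r) * p)
    heuristic = r * L
    return optimal, heuristic, heuristic / optimal


if __name__ == "__main__":
    for L in (3.0, 9.0, 19.0):
        for r in (0.5, 0.75, 0.9, 1.0):
            optimal, heuristic, quotient = case1_comparison(L, r)
            print(f"L={L:5.1f} r={r:4.2f} optimal={optimal:6.3f} "
                  f"rL={heuristic:6.3f} quotient={quotient:5.2f}")
```

For the values tabulated here, the quotient falls toward 1 as r approaches 1 and grows with L at a fixed r, which is the qualitative relationship stated in the text.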

Gettys,  Kelly,  and  Peterson  (1973)  have  suggested  a model 
that  is  slightly  different  from  that  of  Snapper  and  Fryback.  It 
assumes  that  the  decision  maker  estimates  posterior  odds  on  the 
assumption  that  the  most  likely  event  is  true,  and  then  adjusts 
the  odds  to  reflect  the  reliability  of  the  data  source.  This 
model  may  be  represented  as  follows: 

Stage 1: compute $\Omega' = L \Omega_0$

Stage 2: compute $\Omega_1 = r \Omega'$

It is apparent that although the process by which the posterior
odds are estimated differs in the two cases, the results
are precisely the same. Edwards and Phillips (1966) have presented
evidence, however, suggesting that the way in which people estimate
posterior odds may be better described by

$\Omega_1 = L^{c} \Omega_0$ , (60)

where c varies with L, than by the prescribed $\Omega_1 = L \Omega_0$. Funaro
(1974) points out that the models of Snapper and Fryback, and of
Gettys, Kelly, and Peterson make different predictions if the odds
are calculated according to Phillips and Edwards' expression. The
former leads to

$\Omega_1 = (rL)^{c} \Omega_0$ (61)

and the latter to

$\Omega_1 = r L^{c} \Omega_0$ . (62)
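
A short numerical sketch makes the difference between expressions (61) and (62) concrete. The function names and the example values of L, r, and c are hypothetical and chosen only for illustration.

```python
def snapper_fryback_odds(L, r, c, prior_odds):
    """Expression (61): the exponent c applies to the adjusted ratio rL."""
    return (r * L) ** c * prior_odds


def gettys_kelly_peterson_odds(L, r, c, prior_odds):
    """Expression (62): the exponent c applies to L before scaling by r."""
    return r * (L ** c) * prior_odds


if __name__ == "__main__":
    L, r, prior = 4.0, 0.8, 1.0
    for c in (1.0, 0.5):
        print(c,
              round(snapper_fryback_odds(L, r, c, prior), 4),
              round(gettys_kelly_peterson_odds(L, r, c, prior), 4))
```

With c = 1 the two expressions coincide, as noted above for the original models. With 0 < c < 1 and r < 1, expression (61) gives the larger odds, because r raised to the power c exceeds r.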


Funaro (1974) has recently attempted to evaluate the predictive
power of Snapper and Fryback's model and of that of Gettys,
Kelly, and Peterson, using both L and $L^{c}$ as unadjusted likelihood
ratios in each case. A symmetric p(D|H), symmetric r task (Schum
and DuCharme, Case I) was used. Subjects were required to revise
odds estimates under both single-stage (perfect source reliability
assumed) and cascaded-inference conditions. Values of c were
estimated separately for individual subjects from data obtained
in the single-stage conditions.

The results were not consistent with any of the models described
above. They were predicted best by another model that
Funaro proposed. This model, which Funaro called the empirical
model, assumes that subjects accurately estimate $\Lambda$, and then apply
this estimate to the revision of odds with the same degree of
effectiveness, or ineffectiveness, with which they apply L in
single-stage tasks. The conclusion appears to be inconsistent with the
results of Youssef and Peterson (1973), who found that odds
estimates made under cascaded conditions were consistently excessive
relative to those made in single-stage tasks, given $\Lambda$ = L.

Funaro notes, however, that subjects in his experiment could
have acquired a direct appreciation for $\Lambda$ from the proportion
of successes and failures in a series of reports obtained
from the source during the course of the experiment. (In a
symmetrical p(D|H) chips-in-urn situation, one can unambiguously
define a "success" as the drawing, or in this case reporting, of
a chip of the predominant color.) To the extent that subjects
were able to develop a direct awareness of $\Lambda$, the effect would
have been to eliminate the need for a two-stage process and to
transform the task into the simpler problem of revising odds on
the basis of totally reliable data. The suggestion is an eminently
plausible one, and the possibility that this is in fact the way
unreliable data are often accommodated in real-world situations
deserves further study.

8.9 Some Comments on Bayesian Hypothesis Evaluation

Inasmuch as the Bayesian approach to hypothesis evaluation
has received so much attention by decision theorists and
investigators of decision making, it seems important to consider some of
the limitations of this approach. To point out limitations is not,
of course, to deny that the approach has merit. Among its advantages
are the fact that it places minimal demands on memory because data
can be discarded after being used to update the distribution of
probabilities over hypotheses, the fact that it provides a means
of aggregating qualitatively different data in a meaningful way, and
the fact that the procedure for applying data to the evaluation of
hypotheses automatically weights data in terms of their relevance
to the hypotheses being evaluated. It is precisely because the
approach does work well in some contexts that there is a danger of
uncritically concluding that it is appropriate in all cases. The
following observations are based largely on a discussion by Bowen,
Nickerson, Spooner, and Triggs (1970).

First, Bayes rule itself applies to only one of the several
aspects of decision making; namely, hypothesis evaluation or, more
precisely, the resolution of uncertainty concerning the state of
the world. Whatever its efficacy for that particular task, it is
not the grand solution to the problem of decision making.

Second, application of Bayes rule requires that the decision
problem be structured in a very precise way. In particular, it
requires that one's uncertainty about the state of the world be
represented as an exhaustive set of mutually exclusive possibilities.
It does not, however, provide any help in identifying these
possibilities.

Third, the requirement for an exhaustive set of mutually
exclusive hypotheses about the state of the world precludes the
possibility of expanding one's hypothesis space as one goes along.
It clearly often is the case, in real-life situations, that new
hypotheses are suggested by incoming data. That is to say,
observations may have the effect not only of modifying the credibility
of existing hypotheses, but of suggesting new hypotheses as well.

Fourth, the fact that use of Bayes rule presupposes a set of
mutually exclusive hypotheses has another implication. By
definition, one and only one of these hypotheses can be true; all the
others must be false. The probabilities that are associated with
these hypotheses do not, of course, represent their truth values,
but, rather, the decision maker's opinion concerning their truth
or falsity. It was pointed out in the preceding paragraph that no
provision is made for the possibility that the hypothesis set does
not contain the true hypothesis. It is also the case that provision
is not made for the possibility that more than one of the
hypotheses are true, or that one or more is partially true.

Fifth, application of Bayes rule is a recursive process: each
time that a new observation is to be used to update a posterior
probability estimate, the posterior probability from the preceding
update is used as the prior probability for the current update.
Originally, however, before the first observation is made, the
prior probabilities must be estimated, and Bayes rule does not
help in this regard. Investigators are not entirely agreed on how
these prior probabilities should be assigned, or on what they mean.
It is often pointed out that how prior probabilities are assigned
may make little difference (provided values very close to 0 or 1
are not used), because the effect of the initial values will be
largely nulled after several observations have been made. The
problem can be a significant one, however, when hypotheses must be
evaluated on the basis of relatively few data. In such cases, the
initial prior probabilities can have a very strong effect on the
final posteriors, and thus the way in which they are assigned is
of considerable concern.
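
The recursive character of the rule, and the fading influence of the initial prior, can be illustrated with a small sketch. It assumes two exhaustive, mutually exclusive hypotheses and a run of observations that all favor the first hypothesis with p(D|H1) = .7 and p(D|H2) = .3; these numbers and the function names are hypothetical.

```python
def update(prior_h1, p_d_h1, p_d_h2):
    """One application of Bayes rule for two hypotheses; returns p(H1|D)."""
    joint_h1 = prior_h1 * p_d_h1
    joint_h2 = (1.0 - prior_h1) * p_d_h2
    return joint_h1 / (joint_h1 + joint_h2)


def revise(prior_h1, observations):
    """Apply Bayes rule recursively: each posterior becomes the next prior."""
    p = prior_h1
    for p_d_h1, p_d_h2 in observations:
        p = update(p, p_d_h1, p_d_h2)
    return p


if __name__ == "__main__":
    for n in (2, 10):
        data = [(0.7, 0.3)] * n
        low, high = revise(0.2, data), revise(0.8, data)
        print(n, round(low, 4), round(high, 4), round(high - low, 4))
```

In this example the two starting priors (.2 and .8) still lead to clearly different posteriors after two observations, whereas after ten observations the difference is negligible. The sketch does not cover inconsistent data, for which convergence would be slower.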

Sixth, the basic assumption that justifies a Bayesian approach
to hypothesis evaluation is the assumption that man is better at
estimating p(D|H) than at estimating p(H|D). We have noted in
preceding sections some experimental evidence that tends to support
this assumption. We have also noted some studies, however, that
have shown that this result is not always found. Moreover, there
is a question concerning how far the evidence that does support
this assumption can be pushed. The only way that one can determine
how accurately a man can estimate p(D|H) is to observe his
performance in experimental situations in which p(D|H) is objectively
defined or can be determined empirically. But, typically, in
real-life situations of greatest interest, p(D|H) is not known, and
cannot be determined empirically, which is why it must be
estimated. The question arises then, if it is not known, how can
we be sure that one's estimate of it is accurate? And the answer
is that we cannot. How much confidence one should have in the
conclusion that man is better at estimating p(D|H) than at estimating
p(H|D) in real-world situations depends in large part on the extent
to which one is willing to assume that what is known about
performance in laboratory situations in which p(D|H) usually has a
straightforward relative-frequency interpretation is generalizable
to real-world situations in which it does not.

Seventh, Bayes rule does not provide the decision maker with
a criterion concerning when to stop processing incoming data and
to make a decision. Inasmuch as data gathering can be costly in
terms of both time and money, it is essential that any completely
adequate prescriptive model of decision making have an explicit
stopping rule to indicate when hypothesis evaluation should be
terminated and a decision made.

We emphasize that these comments deal with limitations of
Bayes rule. One might argue that the observations are unnecessary,
on the grounds that proponents of Bayesian diagnosis have never
claimed that these limitations do not exist. It seems to us
important to make these limitations explicit, however, because
they help to place the notion of Bayesian decision making in
perspective. The idea of obtaining estimates of p(D|H) or of
likelihood ratios from humans and using these estimates to update
posterior probability distributions in accordance with Bayes theorem
is undoubtedly a reasonable approach to evaluation in some
situations. It is not always appropriate or practicable, however, as
some Bayesians have been careful to point out. Edwards (1967)
describes the situations for which the approach is most appropriate
as those that have one or more of the following three
characteristics: "the input information is fallible, or the relation of input
information to output diagnostic categories is ambiguous or uncertain,
or the output is required to be in explicitly probabilistic form"
(p. 71).

SECTION IX

PREFERENCE  SPECIFICATION 

It is generally assumed that a decision maker is not
indifferent to which of the various possible decision outcomes occurs.
As we have already noted, in some formal representations of
decision situations, the decision maker's preferences with respect
to the possible outcomes are made explicit in a payoff matrix.
The content of a cell of such a matrix is the worth to the
decision maker of the choice of a specific action alternative, given
the truth of a specific hypothesis concerning the state of the
world. The entire matrix presumably represents the situation
fully: it identifies all the decision maker's action alternatives
as well as all the possible states of the world, and shows for each
alternative-state combination its worth to the decision maker.
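
A payoff matrix of this kind can be represented as a simple table indexed by action alternative and state of the world. The sketch below is illustrative only: the actions, states, and worths are invented, and the weighting of worths by state probabilities is one common way of using such a matrix, assumed here rather than taken from the passage above.

```python
# Hypothetical payoff matrix: payoff[action][state] is the worth to the
# decision maker of choosing `action` when `state` is the true state.
payoff = {
    "carry_umbrella": {"rain": 5.0, "dry": -1.0},
    "leave_umbrella": {"rain": -10.0, "dry": 2.0},
}


def worth(action, state):
    """Look up one cell of the payoff matrix."""
    return payoff[action][state]


def expected_worth(action, state_probs):
    """Average the worths of an action over the states (an assumed use)."""
    return sum(p * worth(action, state) for state, p in state_probs.items())


def best_action(state_probs):
    """Return the action with the greatest expected worth."""
    return max(payoff, key=lambda action: expected_worth(action, state_probs))


if __name__ == "__main__":
    probs = {"rain": 0.3, "dry": 0.7}
    for action in payoff:
        print(action, round(expected_worth(action, probs), 2))
    print("choice:", best_action(probs))
```

The difficult part, as the following section argues, is not the arithmetic but obtaining the worths that fill the cells.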

9.1 A Difficult and Peculiarly Human Task

The problem is how to determine these worths. There are two
observations to make in this regard. The first is that this task,
more than any other associated with decision making, is peculiarly
human. One would expect that many of the decision-related tasks
that now must be performed by humans will in time be performed by
computers. However, the specification of preferences for decision
outcomes involves value judgments. To say that one decision
outcome is better than, worth more than, or preferred to, another is
to say that it represents a greater good within the context of the
decision maker's own value system. Such judgments must come, at
least indirectly, from man.

The second observation is that to specify one's preferences
objectively is not necessarily an easy thing for an individual to
do. Even when all of the action alternatives have been made
explicit and the outcome of each possibility is known (that is, even
when uncertainty is minimal), the decision task may still be a very
difficult one. This is particularly true when the worths of the
possible decision outcomes are intangible or depend on many factors.
Consider, for example, the problem of choosing a house for purchase.
Even assuming that one confines his attention to a few houses that
he knows are available, and that he has all the information that
he wants about each one, he has the problem of somehow deriving
from many factors (purchase price; number of rooms; design; general
condition; extras such as porch, garage, storage space, extra baths,
and fireplace; lot location and layout; distance from work; tax rate
in town; services and public facilities in town) a common figure
of merit in terms of which one house can be judged to be more or
less preferred than another.

In military situations, the specification of preferences may
be especially difficult. It may often happen that none of the
possible decision outcomes is intrinsically desirable, and the
decision maker may find himself faced with the necessity of
attempting to choose the least undesirable one. The problem is aggravated
by the fact that the assignment of preferences may necessitate the
weighting of time, materiel, territory and human lives. One balks
at the idea of trying to specify the value of human lives and that
of a piece of territory in terms of a common metric, but this is
what is done, at least implicitly, when a decision is made to
attempt to gain a territorial objective when it is known that the
endeavor is likely to result in the loss of a certain number of men.
Or, consider the private transportation system in the United States.
The builders, users and regulators of automobiles and highways have
implicitly expressed a preference for a system that provides certain
capabilities and conveniences at a cost of approximately 60,000
traffic fatalities per year. One suspects that the exercise of
making explicit how the various factors that contribute to human
preferences are traded off against each other in specific decision
situations would often be revealing to decision makers themselves,
who sometimes may have little conscious appreciation, without going
through such an exercise, of how such factors do combine to
determine their own preferences.

Among  the  eight  aspects  of  decision  making  in  terms  of  which 
this  report  is  organized,  preference  specification  is  one  of  the 
two  (the  other  is  hypothesis  evaluation)  that  have  received  the 
greatest  amount  of  attention  from  philosophers  and  researchers 
alike.  In  the  case  of  decision  making  under  certainty,  the  study 
of  preferences  and  the  study  of  choice  behavior  amount  to  the  same 
thing.  Presumably  one  chooses  what  one  prefers — and  vice  versa— 
if  he  can  know  for  certain  what  the  decision  outcome  will  be. 

9.2 Some Early Prescriptions for Choice

In order to make choices among alternatives that differ with
respect to several incommensurate variables, one must, at least
implicitly, derive from the several variables involved a single
figure of merit with respect to which the alternatives can be
compared. That is to say, one must be able to decide that in some
global sense Alternative A is preferred to Alternative B. How this
is generally done is not known; how it should be done is a matter
of some dispute. Undoubtedly, individual methods for dealing with
the problem range from highly intuitive impressionistic approaches
(I just consider all the factors and decide that I like this
combination better than that) to formal quantitative algorithms.

Benjamin Franklin was familiar with the problem, and his way
of dealing with it is at least of historical interest: "I cannot,
for want of sufficient premises, advise you what to determine, but
if you please I will tell you how... My way is to divide half a
sheet of paper by a line into two columns; writing over the one Pro,
and over the other Con. Then, during three or four days'
consideration, I put down under the different heads short hints of the
different motives, that at different times occur to me for or
against the measure. When I have thus got them all together in
one view, I endeavor to estimate the respective weights...[to] find
at length where the balance lies... And, though the weight of
reasons cannot be taken with the precision of algebraic quantities,
yet, when each is thus considered, separately and comparatively,
and the whole matter lies before me, I think I can judge better,
and am less liable to make a rash step; and in fact I have found
great advantage for this kind of equation, in what may be called
moral or prudential algebra."*

*This account of Franklin's approach to decision making was quoted
by Dawes and Corrigan (1974), who found it in a letter from Franklin
to his friend Joseph Priestley, dated September 19, 1772.

A more formal attempt to procedurize choice behavior was made
at about the same time by the British philosopher and social
reformer, Jeremy Bentham. Starting with the basic premise that
choices should be dictated by the extent to which their outcomes
augment or diminish the happiness of the party or parties whose
interest is in question (the "principle of utility"), Bentham
attempted to define a quasi-quantitative procedure, a "hedonistic
calculus," the use of which would assure that the choices that are
made would be consistent with this principle:

"To take an exact account then of the general tendency
of any act by which the interests of a community are affected
proceed as follows. Begin with any one person of those whose
interests seem most immediately to be affected by it, and
take an account:

(1) Of the value of each distinguishable pleasure
which appears to be produced by it in the first instance.

(2) Of the value of each pain which appears to be
produced by it in the first instance.

(3) Of the value of each pleasure which appears to be
produced by it after the first. This constitutes the fecundity
of the first pleasure and the impurity of the first pain.

(4) Of the value of each pain which appears to be
produced by it after the first. This constitutes the fecundity
of the first pain, and the impurity of the first pleasure.

(5) Sum up all the values of all the pleasures on the one
side, and those of all the pains on the other. The balance,
if it be on the side of pleasure, will give the good tendency
of the act upon the whole, with respect to the interests of
that individual person; if on the side of pain, the bad
tendency of it upon the whole.

(6) Take an account of the number of persons whose
interests appear to be concerned, and repeat the above
process with respect to each. Sum up the numbers expressive
of the degrees of good tendency which the act has, with
respect to each individual in regard to whom the tendency of
it is good upon the whole; do this again with respect to each
individual in regard to whom the tendency of it is bad upon
the whole. Take the balance; which, if on the side of
pleasure, will give the general good tendency of the act,
with respect to the total number or community of individuals
concerned; if on the side of pain, the general evil tendency,
with respect to the same community" (Bentham, 1939, p. 804).

The value of a pleasure or pain, Bentham assumed, would depend
on four factors:

"(1) Its intensity.

(2) Its duration.

(3) Its certainty or uncertainty.

(4) Its propinquity or remoteness."


Bentham did not expect that the procedure he defined would be
"strictly pursued previously to every moral judgment, or to every
legislative or judicial operation"; but he did contend that it
represented a model of how judgments should be made, and a
standard against which whatever procedures are used might be evaluated.
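
Bentham's stepwise procedure lends itself to a short sketch. The version below is a loose illustration, not a formalization that Bentham gave: it assumes that each pleasure and pain has already been assigned a single numerical value, and it does not attempt to combine intensity, duration, certainty, and propinquity, for which the quoted passage supplies no rule. The function names and example values are hypothetical.

```python
def individual_balance(pleasures, pains):
    """Step 5: sum of pleasure values minus sum of pain values for one person."""
    return sum(pleasures) - sum(pains)


def community_tendency(people):
    """Step 6: balance of good and bad tendencies over all persons concerned.

    `people` is a list of (pleasures, pains) pairs, one pair per person,
    where each element is a list of already-assessed numerical values.
    """
    balances = [individual_balance(pl, pa) for pl, pa in people]
    good = sum(b for b in balances if b > 0)   # persons for whom the act is good
    bad = sum(-b for b in balances if b < 0)   # persons for whom the act is bad
    return good - bad


if __name__ == "__main__":
    people = [
        ([3, 2], [1]),   # balance +4: good tendency for this person
        ([1], [4]),      # balance -3: bad tendency for this person
    ]
    print(community_tendency(people))  # positive means a general good tendency
```

The sketch makes plain that the procedure is additive throughout: values are summed within a person and then across persons.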

Bentham's approach to choice behavior can be, and has been,
criticized on philosophical grounds. The principle of "the greatest
pleasure for the greatest number" is itself open to criticism,
because it appears to place no limits on the extent to which the
many can prosper at the expense of the few, provided only that the
"bottom line" of the calculation of the net happiness is increased
in the process. For our purposes, the important point is the fact
that Bentham attempted to reduce the process of making choices to
a stepwise procedure.

9.3 Simple Models of Worth Composition

Although he used language that suggested that he believed
that worth could be quantified and his procedure formalized as
a sort of calculus for computing the worth of any given decision
outcome, Bentham did not himself express his notions in
mathematical form. His conceptualization of the choice process,
however, is clearly suggestive of a linear model which expresses the
worth of a decision alternative as a function of the sum of the
values of the various components of pleasure (or pain) that that
alternative represents, weighted by the number of people that
would be affected by the decision outcome. The choice would, of
course, be the alternative with the greatest calculated worth.

Implicit in Bentham's prescription is the assumption that
the total worth of a decision outcome is a monotonically increasing
function of each of the factors which contribute to the worth, and
that the monotone character of this relationship for any given
factor is independent of the values of the other factors. Yntema
and Torgerson (1961) have suggested that there are probably many
practical choice situations in which this is a valid assumption;
for example, the worth of a vocational choice probably increases
monotonically with the attractiveness to the individual of the
work involved, whatever the status of the other factors to be
considered. Yntema and Torgerson present some data that suggest
that when this is the case, the decision maker's choice behavior
can often be matched, if not improved upon, by a selection
algorithm that takes account only of how worth relates to each of the
factors individually and ignores the ways in which the factors
interact. To develop such an algorithm it is necessary only to
determine how worth varies with the individual factors. Several
ways of making this determination are suggested. An important
point for our purposes is that the relationships of interest may
be inferred from the behavior of the decision maker when confronted
with the task of choosing between pairs of hypothetical alternatives
selected to represent specific (in particular, extreme)
combinations of the relevant factors.

Dawes and Corrigan (1974) have recently taken an even stronger
position with respect to the practicality and the validity of simple
linear decision algorithms in a wide variety of choice situations.
They have shown that if each of the factors contributing to the
worth of a decision outcome has a conditionally monotone*
relationship to that worth, and the measurement of these factors is subject
to error, then not only are decisions that are based on weighted
linear combinations of the factors likely to be better than those
made by human decision makers, but in some cases this is true even
if the weights are equal or randomly chosen. Data from several
studies of judgmental and choice behavior are reviewed in support
of this conclusion. Of the situations reviewed by Dawes and
Corrigan, the only ones in which a linear weighting algorithm did
more poorly than a human decision maker were those in which the
human's judgment was based on information not taken into account
by the algorithm.


* A conditionally  monotone  relationship  is  one  that  is  monotone,  or 
can  be  made  monotone  by  a scaling  transformation. 
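
As a rough illustration of the kind of algorithm at issue, the following
Python sketch scores alternatives with a linear combination of
standardized factors, using equal weights unless others are supplied.
It is a minimal sketch under stated assumptions, not the procedure of
Yntema and Torgerson or of Dawes and Corrigan: the function names, the
standardization step, and the job-offer ratings are all hypothetical and
were introduced here only for illustration.

```python
from statistics import mean, pstdev


def standardize(column):
    """Rescale one factor to zero mean and unit spread so factors are comparable."""
    mu, sigma = mean(column), pstdev(column)
    return [(x - mu) / sigma if sigma else 0.0 for x in column]


def linear_worth(alternatives, weights=None):
    """Score each alternative as a weighted sum of its standardized factors.

    alternatives: list of equal-length factor vectors, each factor oriented
    so that larger is better (the monotone case described in the text).
    weights: one weight per factor; None means equal (unit) weights.
    """
    n_factors = len(alternatives[0])
    if weights is None:
        weights = [1.0] * n_factors
    columns = [standardize([alt[j] for alt in alternatives])
               for j in range(n_factors)]
    return [sum(w * columns[j][i] for j, w in enumerate(weights))
            for i in range(len(alternatives))]


# Hypothetical data: three job offers rated on salary (thousands of dollars),
# location (1-10), and security (1-10).
offers = [[52, 4, 9], [60, 7, 5], [48, 9, 8]]
scores = linear_worth(offers)
# Index of the offer with the highest equal-weight score.
best = max(range(len(offers)), key=scores.__getitem__)
```

Note that the interactions among factors are ignored entirely; only the
relation of each factor, taken individually, to worth enters the score.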

9.4 The Problem of Identifying Worth Components

The  implication  is  that  if  one  can  identify  the  factors  in 
terms  of  which  worth  is  determined,  one  frequently  can  improve 
significantly  upon  human  judgment  by  application  of  a simple  linear 
model.  The  problem,  according  to  this  view,  is  not  in  the  develop- 
ment of  arcane  mathematical  decision  algorithms,  or  even  in  the 
application  of  complex  weighting  functions  to  a linear  combination 
rule,  but  that  of  identifying  the  dimensions  of  the  choice  space 
and  of  determining  how  these  dimensions  relate,  individually,  to 
the  worth  of  the  possible  decision  outcomes. 

The  danger  in  this  line  of  reasoning  is  that  of  assuming  that 
identification  of  the  factors  in  terms  of  which  judgments  are,  or 
should  be,  made  is  a trivial  task.  As  we  have  already  suggested, 
such  an  assumption  is  almost  certainly  false  for  many,  if  not  most, 
real-life decision situations. Most people can probably recall
choices that they have made which they realize in retrospect were
made  without  consideration  of  some  factor  that  they  would  have 
recognized  as  relevant  and  important  if  only  they  had  thought  to 
think  of  it.  An  individual  buys  a house,  for  example,  and  realizes 
too late that he failed to determine whether the cellar leaks.
Had the question occurred to him, he would have recognized it not
only  as  a relevant  consideration  but  as  one  that  would  have  figured 
heavily  in  his  assessment  of  the  relative  worths  of  candidate  pur- 
chases. A potentially  important  aid  to  a decision  maker  would  be 
a procedure  that  would  facilitate  the  identification  of  the  dimen- 
sions of  his  choice  space.  Having  determined  the  factors  upon 
which  the  relevant  worths  of  possible  choices  depend,  and  how  these 
factors  relate  functionally  to  worth,  a simple  linear  model  of  the 
type  espoused  by  Yntema  and  Torgerson  might  then  be  used  to  infer 
the  decision  maker's  behavior  in  a choice  situation.  The  experi- 
mental results  reviewed  by  Dawes  and  Corrigan  suggest  that  such  a 
model  might  even  be  used  in  place  of  the  decision  maker  to  effect 
the  choice. 

9.5 Studies of Choice Behavior

In  using  the  choices  of  a human  as  the  standard  against  which 
to  compare  the  performance  of  a model,  one  is  assuming  that  humans 
behave  in  at  least  a consistent,  if  not  an  optimal,  fashion.  Only 
recently  has  the  assumption  that  decision  makers  are  able  to  make 
consistent  choices  among  alternatives  that  differ  on  many  dimen- 
sions without  recourse  to  formal  analytical  procedures  been  tested. 

Slovic  and  Lichtenstein  (1971)  have  reviewed  several  approaches 
that  have  been  taken  to  the  problem  of  describing  how  people  do  in 
fact  make  such  choices.  They  divide  these  approaches  into  two 
major categories: those that make use of correlational or
regression analysis or the closely related analysis of variance, and
those that make use of Bayes' theorem. Among the non-Bayesian
approaches that are reviewed are the correlation model of Hoffman
(1960, 1968), the lens model of Brunswik (1952, 1956), the integration
theory of Anderson (1968, 1969), and the theory of conjoint
measurement of Luce and Tukey (1964) and Krantz and Tversky (1971).
The objective
in  all  of  this  work  is  to  discover  and  describe  how  a human  "judge" 
combines  information  concerning  different  attributes  of  a choice 
alternative  to  arrive  at  a judgment  of  its  overall  desirability 
relative  to  the  other  alternatives  among  which  a choice  is  to  be 
made.


The results of many of the studies reviewed by Slovic and
Lichtenstein (1971) suggest that, although people can make "wholistic
evaluations" (Fischer, 1972), they tend to focus their
considerations on less than the full set of dimensions, and, as a
consequence, frequently ignore potentially important information.
Also, there appears to be a degree of random error in the evaluation
process which increases as the decision maker attempts to consider
increasing numbers of relevant attributes (Hayes, 1964; Kanarick,
Huntington, & Petersen, 1969; Rigney & DeBow, 1966).

On  the  basis  of  results  obtained  in  his  study  of  job-seeking 
behavior,  Soelberg  (1967)  challenged  the  idea  that  people  generally 
do  make  choices  in  accordance  with  worth-calculation  models  in 
real-world  situations.  In  his  words,  "The  decision  maker  believes 
a priori  that  he  will  make  his  decision  by  weighting  all  relevant 
factors with respect to each alternative, and then 'add up
numbers' in order to identify the best one. In fact, he does not
generally do this, and if he does, it is done after he has made
an 'implicit' selection among alternatives" (p. T8). Soelberg
draws a number of other conclusions from his study which, in the
aggregate, seem to suggest that much of the effort that goes into
decision making is calculated to rationalize--rather than arrive
at--a choice. It's as though the decision maker were in cahoots
with  himself  to  deceive  himself  into  perceiving  his  choices  as 
well-founded  when  in  fact  the  real  basis  for  them  may  be  unknown. 

9.6 Procedures for Specifying Worth

Obviously people can--people do--make choices among
multidimensional stimuli; the results mentioned above suggest, however,
that our ability to handle many dimensions simultaneously in a
consistent and reliable way without the aid of a formal procedure
is somewhat limited. Given that the problem seems to be one of
exceeding  man's  ability  to  process  information,  it  is  not  sur- 
prising that  some  of  the  solutions  that  have  been  proposed  take 
the  form  of  ways  of  restructuring  unmanageable  problems  so  as  to 
make  them  into  problems  of  simpler  proportions.  Such  procedures 
are  sometimes  referred  to  as  decomposition  procedures  because 
they  divide  the  task  into  subtasks  that  presumably  are  within  the 
decision maker's information-processing capabilities. The solutions
to the subtasks are then used as a basis for inducing a solution to
the  original  problem. 

These  procedures  typically  involve  a number  of  steps  (e.g., 
Fischer,  1972)  such  as  specifying  the  alternatives  to  be  compared, 
specifying  the  dimensions  or  factors  with  respect  to  which  the 
alternatives  are  to  be  compared,  assessing  the  worth  of  each  alter- 
native with  respect  to  each  dimension,  and  combining  the  results  of 
the  dimension-by-dimension  assessments  into  some  overall  indicant 
of  worth  for  each  alternative.  The  first  of  these  steps  has  not 
been  a focus  of  attention  in  studies  of  preference  specification; 
the  alternatives  usually  are  provided.  In  the  real  world,  identi- 
fying these  alternatives  can  be  a nontrivial  problem,  but  it  is 
perhaps  better  thought  of  as  a problem  of  information  gathering 
than one of specifying preferences. The second step also has not
received  much  research  attention. 

A great  deal  of  attention  has  been  given  to  the  third  of  the 
steps mentioned by Fischer (e.g., Becker & McClintock, 1967; Coombs,
1967; Fischer & Peterson, 1972; Fishburn, 1967; Hammond, 1967; Huber,
Sahney,  & Ford,  1969;  Luce  & Tukey,  1964;  MacCrimmon,  1968;  Miller, 
Kaplan,  & Edwards,  1967;  Raiffa,  1968).  Numerous  techniques  have 
been  proposed  and  studied  for  assessing  the  worths  of  alternatives 
with  respect  to  individual  dimensions  or  factors.  These  techniques 
range  from  simple,  qualitative  pair-comparison  procedures  that 
yield  ordinally  scaled  preferences  to  relatively  complex  methods 
for  deriving  ratio  scales  for  interdependent  factors. 

MacCrimmon  (1968)  has  reviewed  several  prescriptive  techniques 
for  choosing  among  alternatives  that  differ  with  respect  to  multiple 
factors.  The  techniques  that  he  considers  are  discussed  under  the 
following rubrics: (1) dominance, (2) satisficing, (3) maximin, (4)
maximax, (5) lexicography, (6) additive weighting, (7) effectiveness
index,  (8)  utility  theory,  (9)  tradeoffs,  and  (10)  nonmetric  scaling. 
In  each  case,  he  describes  the  necessary  assumptions  and  information 
requirements,  and  presents  a formal  mathematical  representation  of 
the  optimal  (or  best)  choice  defined  by  the  technique.  Considera- 
tion is  also  given  to  the  possibility  of  using  several  methods  in 
combination  on  a given  choice  problem,  as  suggested  earlier  by 
Pinkel (1967).
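
To make the differences among several of these rubrics concrete, the
following Python sketch applies six of them (dominance, satisficing,
maximin, maximax, lexicography, and additive weighting) to one small
table of alternatives. It is only an illustrative sketch, not
MacCrimmon's formal representation: the function names, the ratings,
the cutoffs, and the weights are hypothetical, and all factors are
assumed to be rated on a common 0-10 scale with larger values better,
which maximin and maximax in particular require.

```python
def dominates(a, b):
    """True if a is at least as good as b on every factor and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def nondominated(alts):
    """Dominance: discard any alternative dominated by another."""
    return [n for n, a in alts.items()
            if not any(dominates(b, a) for m, b in alts.items() if m != n)]


def satisficing(alts, minimums):
    """Satisficing: keep alternatives that meet a minimum on every factor."""
    return [n for n, a in alts.items()
            if all(x >= m for x, m in zip(a, minimums))]


def maximin(alts):
    """Maximin: choose the alternative whose worst factor is best."""
    return max(alts, key=lambda n: min(alts[n]))


def maximax(alts):
    """Maximax: choose the alternative whose best factor is best."""
    return max(alts, key=lambda n: max(alts[n]))


def lexicographic(alts, order):
    """Lexicography: compare on the most important factor, then the next."""
    return max(alts, key=lambda n: tuple(alts[n][j] for j in order))


def additive_weighting(alts, weights):
    """Additive weighting: choose the largest weighted sum of factors."""
    return max(alts, key=lambda n: sum(w * x for w, x in zip(weights, alts[n])))


# Hypothetical ratings of four cars on (economy, comfort, reliability).
cars = {"A": (7, 5, 6), "B": (9, 2, 8), "C": (6, 4, 5), "D": (4, 10, 7)}

survivors = nondominated(cars)                        # C is dominated by A
acceptable = satisficing(cars, (5, 4, 5))
cautious = maximin(cars)
optimistic = maximax(cars)
by_priority = lexicographic(cars, (2, 0, 1))          # reliability first
by_weights = additive_weighting(cars, (0.5, 0.3, 0.2))
```

With these illustrative numbers the rules do not agree on a single
choice, which is one reason for considering the use of several methods
in combination on a given problem.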

A more  recent  review  of  worth-assessment  techniques  has  been 
prepared  by  Kneppreth,  Gustafson,  Leifer,  and  Johnson  (1974).  In 
this review, methods are classified in terms of five properties:
(1) whether probabilities are used, (2) what kind of judgment is
required (e.g., simple preference, numerical assignment), (3) number
of factors involved in a single judgment, (4) whether appropriate
for continuous or discrete factors, and (5) nature of output
produced (e.g., ranking of worth, quantitative indicant of worth).
Especially helpful features of this review are explicit discussions
of what the authors see as the primary advantages and disadvantages
associated with each of the methods described, and the provision of
references for the theoretical bases of these techniques. Of
particular relevance to this report is the stress that Kneppreth et al.
put on the need for training before some of these techniques can be
used effectively.

The  fourth  step  mentioned  by  Fischer--that  of  combining  the 
results of factor-by-factor assessments into overall worth
estimates--has proven not to be a difficult one in many practical
situations because of the fact that a simple linear combination rule
seems  to  work  remarkably  well  in  so  many  cases  (see  Section  9.3). 

Prescriptive  techniques  for  preference  specification,  or  worth 
assessment,  are  of  considerable  interest  because  of  the  potential 
that they represent for procedurizing--and thereby, hopefully,
simplifying--the solutions for complex choice problems. A less
tangible  but  perhaps  no  less  important  benefit  that  can  result  from 
attempts  to  apply  such  prescriptive  techniques  in  real-world  situa- 
tions stems  from  the  fact  that  these  procedures  force  the  decision 
maker  to  be  explicit  concerning  his  own  value  system  as  it  relates 
to  the  problem  at  hand.  This  fact  has  obvious  ramifications  vis-a- 
vis  the  problem  of  evaluating  the  performance  of  decision  makers 
who  make  choices  that  affect  the  lives  of  others;  one  clearly  wants 
to  know,  in  such  cases,  not  only  what  the  choices  are,  but  the 
bases  on  which  they  are  made.  Being  forced  to  be  explicit  concern- 
ing the  factors  that  determine  his  choice  and  the  relative  importance 
that he attaches to each of them may be as revealing to the
decision maker himself as to an independent observer.


9.7 Preferences among Gambles

So  far,  we  have  considered  only  the  problem  of  specifying 
preferences among stimuli that differ perhaps in many, but in known,
ways.  In  this  case  the  decision  maker  knows  what  the  effect  of  any 
choice  that  he  may  make  will  be.  Another  type  of  preference  speci- 
fication that  has  been  studied  involves  preferences  among  gambles, 
or between gambles and "sure things." The general procedure in
such  studies  is  to  present  the  decision  maker  with  a choice,  either 
between  two  wagers,  or,  more  typically,  between  a wager  and  a sure 
thing,  and  then  to  adjust  either  the  possible  outcomes  of  the 
wager(s) or the probabilities of these outcomes until the decision
maker  is  indifferent  to  the  alternatives  from  which  he  must  choose. 
By repeating this process a number of times with different wagers,
one  can  generate  the  kind  of  data  from  which  worth  functions  can 
be  inferred. 
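
The adjustment procedure can be sketched in Python as follows. The
sketch adjusts the probability of winning a fixed prize until a
responder is indifferent between the wager and a fixed sure thing, and
repeats this for several sure amounts. It is a minimal sketch under
stated assumptions: the human subject is replaced by a hypothetical
simulated responder whose worth function is a square root, the wager
pays the prize or nothing, and every name in the code is introduced
here for illustration only.

```python
import math


def indifference_probability(prefers_wager, tol=1e-4):
    """Adjust the probability of winning, by bisection, until the responder
    is (nearly) indifferent between the wager and the sure thing.

    prefers_wager(p) must return True when the wager with win probability p
    is preferred to the sure thing, and must not switch back to False as p
    increases.
    """
    low, high = 0.0, 1.0
    while high - low > tol:
        p = (low + high) / 2
        if prefers_wager(p):
            high = p   # wager already attractive; make it less so
        else:
            low = p    # wager not attractive enough; make it more so
    return (low + high) / 2


def simulated_responder(sure_amount, prize, utility=math.sqrt):
    """Hypothetical stand-in for a subject: prefers whichever option has the
    larger expected worth under an assumed (square-root) worth function."""
    def prefers_wager(p):
        return p * utility(prize) > utility(sure_amount)
    return prefers_wager


PRIZE = 100.0
# Repeat the procedure for several sure amounts. On a scale where nothing is
# worth 0 and the prize is worth 1, the indifference probability estimates
# the worth of each sure amount.
worth_points = {
    sure: indifference_probability(simulated_responder(sure, PRIZE))
    for sure in (1.0, 25.0, 64.0)
}
```

The resulting points trace out the responder's worth function: here
the inferred worths rise less than proportionally with the amounts,
as the assumed square-root function implies.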

Typically, the wagers that have been used in these studies
are such that one of the possible outcomes is more desirable than
the other, and the probability of the less desirable outcome is
unity minus the probability of the more desirable one. Slovic
(1967, 1969), however, has studied preference behavior in so-called
duplex gambles in which the probabilities of "winning" or "losing"
can be varied independently of respective payoffs. In this
situation, the decision maker can win and not lose, lose and not win,
win and lose, or neither win nor lose. As Slovic points out, "It
can be argued that this type of gamble is as faithful an
abstraction of real-life decision situations as its more commonly studied
counterpart in which the probability of losing is equal to unity
minus the probability of winning (P_L = 1 - P_W). For example, the
choice of a particular job might offer some probability (P_W) of a
promotion and some probability (P_L) of a transfer to an undesirable
location, and it is possible that one of these events, both of
them, or neither of them, will occur" (p. 223).
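
The four possible results of a duplex gamble, and their probabilities,
can be enumerated directly. The Python sketch below does so on the
assumption that the winning and losing events are statistically
independent; that assumption, the function names, and the numerical
values are introduced here for illustration and are not taken from
Slovic's reports.

```python
def duplex_outcomes(p_win, win_amount, p_lose, lose_amount):
    """Return {result: (probability, net payoff)} for a duplex gamble,
    assuming the winning and losing events are independent."""
    return {
        "win and not lose": (p_win * (1 - p_lose), win_amount),
        "lose and not win": ((1 - p_win) * p_lose, -lose_amount),
        "win and lose": (p_win * p_lose, win_amount - lose_amount),
        "neither": ((1 - p_win) * (1 - p_lose), 0.0),
    }


def expected_value(outcomes):
    """Probability-weighted average of the net payoffs."""
    return sum(p * payoff for p, payoff in outcomes.values())


# Hypothetical gamble: 0.4 chance to win $2, 0.2 chance to lose $1.
example = duplex_outcomes(p_win=0.4, win_amount=2.0, p_lose=0.2, lose_amount=1.0)
total_probability = sum(p for p, _ in example.values())
ev = expected_value(example)
```

Under the independence assumption the expected value reduces to
P_W times the amount won minus P_L times the amount lost, so the four
risk dimensions can be manipulated separately.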

In the first of Slovic's studies, two different methods of
indicating  the  attractiveness/unattractiveness  of  a wager  were 
explored.  One  method  required  the  subjects  to  rate  strength  of 
preference  directly  on  a scale  ranging  from  +5  (strong  preference 
for playing) to -5 (strong preference for not playing). The second
required  the  subject  to  equate  the  attractiveness  of  this  gamble 
with  an  amount  of  money  such  that  he  would  be  indifferent  to  play- 
ing the  gamble  or  receiving  the  stated  amount.  One  third  of  the 
subjects  assigned  to  the  second  method  were  required  to  state  the 
largest  amount  they  would  be  willing  to  pay  the  experimenter  in 
order  to  play  each  bet,  and,  for  an  undesirable  bet,  the  smallest 
amount  the  experimenter  would  have  to  pay  them  before  they  would 
play  it.  Another  third  of  the  subjects  were  given  ownership  of  a 
ticket  for  each  gamble  and  required  to  state  the  least  amount  of 
money  for  which  they  would  sell  the  ticket.  The  subjects  in  the 
final third were required to state a fair price for a given gamble
in the absence of information as to whether they or the experimenter
owned  the  right  to  play  it. 

Slovic  demonstrated  that  subjects  did  not  weight  the  risk 
dimensions in the same way when bidding as when rating. Variation
in the ratings was influenced primarily by variation in probability
of winning (P_W), while variation in bidding was influenced primarily
by variation in probability of losing (P_L). Also, payoff
dimensions--dollars won ($W) and dollars lost ($L)--produced more effect
on bids than on ratings, while probability dimensions produced
more effect on ratings than on bids. Finally, it was found that
when a person in the bidding group considered a bet to be attractive,
his judgment of its degree of attractiveness was determined
primarily by the amount ($W); when he disliked a bet, the primary
determinant of the degree of dislike was ($L). This finding has
particularly  important  methodological  implications,  because,  as 
Slovic  points  out,  no  existing  prescriptive  theory  of  decision 
making would consider that response mode should be a determinant of
the way in which decision makers utilize probabilities and payoffs
in making decisions under risk, and he argues that behavior in such
circumstances may be strongly influenced by information-processing
considerations.

9.8 Preference Specification and Training

On first thought, preference specification--among all the
tasks associated with decision making--might appear to pose the
least challenge for training research. One might assume
that if there is an aspect of decision making that comes
naturally, it should be that of saying what one's preferences are.
Things clearly are not that simple, however, and the evidence is
abundant that people do not always know what their preferences are,
or at least how to specify them in an unambiguous and consistent
way. 


The  research  reviewed  in  this  report  suggests  at  least  four 
problems that relate to training and preference specification.
First is the question of how to train people to make judgments of
subjective  probability  that  are  independent  of  the  worths  of  pos- 
sible decision  outcomes,  as  the  use  of  subjective  expected  utility 
models  requires  (see  Section  2.2).  A second  and  closely  related 
question  is  that  of  how  to  train  people  to  make  worth  judgments 
that  are  invariant  across  different  measuring  techniques. 

The  development  of  decomposition  methods  has  been  motivated 
by  an  interest  in  simplifying  the  process  of  making  preferences, 
and  their  bases,  explicit.  As  Kneppreth,  Gustafson,  Leifer,  and 
Johnson  (1974)  have  pointed  out,  however,  some  of  these  procedures, 
particularly  those  that  yield  the  most  quantitative  results,  are 
workable  only  with  relatively  sophisticated  users.  A third  challenge 
for  training  research,  therefore,  is  to  develop  methods  for  pro- 
viding the  necessary  training  in  cost-effective  ways. 

A fourth  problem  relates  to  two  aspects  of  decision  making, 
preference  specification  and  information  gathering.  In  laboratory 
studies  of  choice,  the  dimensions  in  terms  of  which  preferences 
are  to  be  specified  typically  are  given.  In  real-world  situations, 
however, the dimensions of choice are often determined by the
decision  maker  himself;  in  other  words,  the  factors  that  are  con- 
sidered in  attempting  to  assess  the  relative  merits  of  the  choice 
alternatives  are  those  that  the  decision  maker  happens  to  think 
about.  Surprisingly  little  attention  has  been  given  by  researchers 
to  the  question  of  how  capable  people  are  at  enumerating  on  demand 
the  factors  that  they  would  consider  important  in  any  particular 
choice  situation.  It  is  not  even  clear  whether,  when  provided  with 
a list  of  such  factors,  one  can  say  with  confidence  whether  the 
list is complete. Much more research is needed, both to determine
human limitations and performance characteristics in this regard,
and  to  explore  how  training  might  improve  one's  ability  to  make 
one's  worth  space  explicit  vis-a-vis  specific  choice  problems. 

SECTION X
ACTION  SELECTION 

Selection, or choice, is often thought of as representing the
essence of decision making. And obviously, if one has no options,
then he has no decisions to make. Paradoxically, however, the act
of choosing per se is the least interesting of the aspects of
decision making that are considered in this report. This is
because of the fact that when the other aspects have been realized--
when  information  has  been  obtained,  the  decision  space  structured, 
hypotheses  generated  and  evaluated,  and  preferences  stated--the 
choice  may,  in  effect,  have  been  determined.  This  is,  of  course, 
as  it  should  be.  One’s  goal  in  all  of  these  activities  is  to  remove, 
insofar  as  possible,  doubt  about  what  the  choice  should  be. 

In  spite  of  his  best  efforts  to  reduce  uncertainty  to  a minimum, 
and  thereby  to  discover  what  his  decision  ought  to  be,  however,  the 
decision maker may, on occasion, feel very much "left to his own
devices" when forced to make a choice. Ellsberg (1961) rather
graphically  described  the  sense  of  frustration  that  one  can  feel 
when  he  faces  his  moment  of  truth  and  is  not  entirely  convinced  of 
the adequacy of the basis on which the choice will have to be made.
"(This) judgment of the ambiguity of one's information, of the
over-all credibility of one's composite estimates, of one's confidence
in them, cannot be expressed in terms of relative likelihoods or
events (if it could, it would simply affect the final, compound
probabilities). Any scrap of evidence bearing on relative
likelihood should already be represented in those estimates. But having
exploited  knowledge,  guess,  rumor,  assumption,  advice,  to  arrive 
at  a final  judgment  that  one  event  is  more  likely  than  another  or 
that  they  are  equally  likely,  one  can  still  stand  back  from  this 
process  and  ask:  'How  much,  in  the  end,  is  all  this  worth?  How 
much  do  I really  know  about  the  problem?  How  firm  a basis  for 
choice,  for  appropriate  decision  and  action,  do  I have?'  The 
answer, 'I don't know very much, and I can't rely on that,' may
sound  rather  familiar,  even  in  connection  with  markedly  unequal 
estimates  of  relative  likelihood.  If  'complete  ignorance'  is  rare 
or non-existent, 'considerable' ignorance is surely not" (pp. 20, 21).*

*This statement is contained within a larger discussion of
circumstances in which it may be "sensible" to act in conflict with the
prescription of the Savage (1954) axioms (see Section 2.2). The
reader is referred to the full discussion for an interesting
analysis of the problem of ambiguity in choice behavior.

Most of the decision situations that we have considered in
this report involve the problem of choosing one from among several
courses of action. It is important to note, however, that people
sometimes  find  themselves  faced  with  the  task  of  deciding  not  what 
to  do,  but  when  to  do  it.  The  required  action  may  be  dictated 
by  circumstances,  or  predetermined  in  one  way  or  another,  but 
the  individual  is  left  with  the  job  of  deciding  on  the  best  time 
to  act.  This  type  of  decision  problem  is  nicely  illustrated  by 
the  following  situation. 

Consider  a pistol  duel  in  which  the  duelists  are  instructed 
to  turn  to  face  each  other  on  signal  and  to  fire  one  shot  at  will. 
Suppose  that  once  the  men  have  faced  each  other,  each  may  walk 
toward  the  other,  reducing  the  distance  between  them  if  he  wishes. 
We  may  assume  that  the  accuracy  of  each  duelist  improves,  although 
not  necessarily  at  the  same  rate,  as  the  distance  between  them 
decreases. Clearly, each man faces a dilemma: every second that
he delays firing in order to decrease the distance between him and
his opponent and to increase his chances of an accurate shot, he
also increases the chances of success for his opponent; on the
other hand, if he fires too soon, he risks missing, in which case
his  opponent  is  free  to  advance  on  him  until  his  shot  will  be  cer- 
tain to  find  its  mark. 
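
The structure of the dilemma can be illustrated with a small
computation. The Python sketch below rests on simplifying assumptions
that go beyond the description above: each duelist's accuracy is a
known, increasing function of normalized time t between 0 and 1;
whoever fires first and misses is certain to be hit; and the duelist
seeks the firing time that maximizes the success probability he can
guarantee whatever his opponent does. If he fires first at time t he
succeeds with probability equal to his own accuracy at t; if his
opponent fires first at time t he succeeds only if the opponent
misses. The accuracy functions and all names are hypothetical.

```python
def best_firing_time(own_accuracy, opponent_accuracy, steps=1000):
    """Search a grid of normalized times for the firing time that maximizes
    the guaranteed success level min(own hit, opponent miss)."""
    best_t, best_level = 0.0, -1.0
    for k in range(steps + 1):
        t = k / steps
        # Success if firing first at t, versus success if the opponent
        # fires first at t and misses; the smaller is the guaranteed level.
        level = min(own_accuracy(t), 1.0 - opponent_accuracy(t))
        if level > best_level:
            best_t, best_level = t, level
    return best_t, best_level


# Hypothetical accuracy curves: one's own accuracy grows linearly with time,
# the opponent's more slowly at first.
t_star, guaranteed = best_firing_time(lambda t: t, lambda t: t ** 2)
```

Under these assumptions the best time falls where the two accuracies
sum to one: firing earlier sacrifices accuracy needlessly, and waiting
longer hands the advantage to the opponent.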

This  type  of  situation  is  representative  of  what  Sidorsky, 
Houseman,  and  Ferguson  (1964)  have  characterized  as  "implementation 
type decision tasks." In Sidorsky's experiments the duelists were
simulated  navy  tactical  units,  but  the  problem  was  essentially  the 
same  as  that  of  the  individual  antagonists.  The  decision  maker 
had  to  decide  when  to  fire  a missile,  knowing  that  both  the  proba- 
bility of  hitting  his  opponent  and  the  probability  of  being  hit  by 
him  were  increasing  (but  at  different  rates)  in  time. 

A particularly interesting result from this work is the
finding that subjects performed less appropriately when operating at
a disadvantage  than  when  operating  at  an  advantage.  One  of  the 
conclusions  that  Sidorsky  and  his  colleagues  drew  from  the  results 
of  a series  of  studies  (Sidorsky  & Houseman,  1966;  Sidorsky,  House- 
man, & Ferguson,  1964;  Sidorsky  & Simoneau,  1970)  was  that  "the 
inability  to  analyze  and  respond  appropriately  in  disadvantageous 
situations  is  a major  cause  of  poor  performance  in  tactical  de- 
cision making"  (Sidorsky  & Simoneau,  1970,  p.  57).  If  this  obser- 
vation is  generally  valid,  its  implications  for  tactical  decision 
making  are  clearly  very  significant.  The  implications  for  training 
are  also  apparent,  namely,  the  need  for  extensive  decision-making 
experience  in  disadvantageous  situations. 

SECTION XI
DECISION EVALUATION

The problem of evaluating the performance of decision makers
is a difficult one and it is critically important to the task of
training. Without an evaluation scheme, there is no way of
ascertaining whether training has resulted in an improvement in
decision-making performance. Training assessment is not the only
reason for an interest in evaluation of decision-making performance,
however. Anyone who finds himself in a position of having to pass
judgment on the performance of a decision maker is in need of a
set of criteria in terms of which that judgment can be made.
Moreover, a decision maker himself might wish to evaluate a particular
decision that he has made in terms of a set of objective criteria.

Unfortunately, a completely satisfactory set of objective
criteria against which performance can be compared has not been
developed. As Kanarick (1969) has pointed out, "unlike other
behaviors, there is no standard dependent variable, such as
time-on-target, trials to criterion, or percent correct." One can, of
course, choose for study in the laboratory only tasks for which
performance can be objectively evaluated (e.g., probability
estimation for frequentistic events); however, one runs the risk of
thereby excluding from study a large percentage of the problems of
interest. Certainly, in most real-life decision situations in
which the objectives are complex, the stakes are real, and the
information is incomplete, evaluation is an extremely difficult
task.


11.1 Effectiveness versus Logical Soundness

Of central importance to a discussion of evaluation of
decision making is the distinction between effectiveness and logical
soundness. Failure to make this distinction sharply--sometimes
to make it at all--has resulted in much confusion in the
literature. Effectiveness and logical soundness are quite different
things. One might be willing to assume that logically sound
decisions will, on the average, tend to be more effective than
decisions  that  are  not  logically  sound.  However,  the  assumption 
that  the  correspondence  will  necessarily  hold  in  any  particular 
instance  is  manifestly  not  valid. 

A decision  is  effective  to  the  extent  that  the  result  to 
which it leads is one which the decision maker desires.
Effectiveness usually is easily determined after the fact. The logical
soundness  of  a decision  depends  on  the  extent  to  which  the  de- 
cision maker's  choice  of  action  is  consistent  with  the  information 
available  to  him  at  the  time  the  decision  was  made,  and  with  the 
decision  maker's  own  preferences  and  goals.  That  these  are  quite 
different factors is clear from a simple example. Suppose that
one is given the option of betting $5 against $20 that the next
roll of a fair die will come up 6, or betting $10 against $12 that
the up face on the next roll will have an odd number of dots. If
he  elects  to  make  the  first  bet  and  the  roll  produces  a 6,  we  would 
say  that  the  decision  was  an  effective  one.  However,  whether  it 
could  be  considered  a logically  sound  one  would  depend  on  what  the 
decision  maker's  objectives  were.  If  his  intent  was  to  maximize 
his  potential  gain,  or  to  minimize  his  potential  loss,  the  decision 
was  sound.  If  his  intent  was  to  maximize  his  expected  gain,  it 
was  not. 
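
The arithmetic behind this example can be checked with a few lines of
Python. The sketch reads "betting $5 against $20" as staking $5 to win
$20; that reading, and the names used in the code, are assumptions
made here for illustration.

```python
from fractions import Fraction

# Each bet: probability of winning, amount staked (lost on a loss),
# and amount won on a win.
bets = {
    "six": {"p_win": Fraction(1, 6), "stake": 5, "prize": 20},
    "odd": {"p_win": Fraction(1, 2), "stake": 10, "prize": 12},
}


def expected_gain(bet):
    """Probability-weighted gain: win the prize or lose the stake."""
    return bet["p_win"] * bet["prize"] - (1 - bet["p_win"]) * bet["stake"]


# The bet favored under each of the three objectives named in the text.
max_potential_gain = max(bets, key=lambda name: bets[name]["prize"])
min_potential_loss = min(bets, key=lambda name: bets[name]["stake"])
max_expected_gain = max(bets, key=lambda name: expected_gain(bets[name]))
```

The first bet offers the larger possible gain ($20 against $12) and the
smaller possible loss ($5 against $10), but its expected gain is -5/6
of a dollar, whereas that of the second bet is +$1. Which bet is the
logically sound choice therefore depends on the objective, regardless
of how the die happens to fall.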

Decision-making behavior should be evaluated in terms of its
logical defensibility and not in terms of its effectiveness,
inasmuch as effectiveness is bound to be determined in part by factors
beyond a decision maker's control, and usually beyond his knowledge
as well.* It often appears not to work this way in practice,
however. Evaluation of decisions in terms of their outcomes seems
to be the rule, for example, in the world of finance and business.
Investment  counselors  are  hired  and  fired  on  the  basis  of  the  con- 
sequences of  their  portfolio  recommendations,  and  corporate  manage- 
ments are  frequently  juggled  as  a result  of  unsatisfactory  profit 
and  loss  statements.  Although  the  cliche  "it's  the  results  that 
count"  has  particularly  strong  intuitive  appeal  in  this  context, 
decision outcome is no more justified as the basis for evaluation
of  decision  making  in  the  financial  world  than  in  any  other.  As 
Krolak  (1971)  asserts  in  a discussion  of  portfolio  management 
evaluation: "The real question to be answered is how well did [I]
do with the information, capital, strategy and ability to assume
risk compared with others who might possess the same resources?"
(p. 23b).

That decision-making performance in military-training situations
is not always evaluated in terms of its logicality has been noted
by Hammell and Mara (1970). In discussing some of the mission
training that is carried out in ASW tactical training programs,
they point out that performance evaluation is based, in many
instances, on the simple effectiveness indicator of whether or not
the team scores a hit. If it does, performance is judged to be
acceptable. Commenting on specific training exercises that they
observed, they note: "If a hit was made, regardless of circumstances,
each team member's performance was usually considered good... In
some instances a hit was scored because the target would make a
predetermined maneuver into the path of a torpedo which had been
obviously fired in a wrong direction" (p. 9).

* Commenting on Fuchida and Okumiya's account of the WWII battle of
Midway, Admiral Spruance (1955) made the following interesting
observation: "In reading the account of what happened on 4 June,
I am more than ever impressed with the part that good or bad
fortune sometimes plays in tactical engagements. The authors give us
credit, where no credit is due, for being able to choose the exact
time for our attack on the Japanese carriers when they were at a
great disadvantage--flight decks full of aircraft fueled, armed
and ready to go. All that I can claim credit for, myself, is a
very keen sense of the urgent need for surprise and a strong
desire to hit the enemy carriers with our full strength as early
as we could reach them."

It is probably safe to assume that most people in decision-making
positions are more likely to be rewarded, or censured, as
the case may be, on the basis of the effectiveness of their
decisions than on that of their logical quality. This is due in
part perhaps to the fact that society is far more interested in
the results produced by its decision makers than in the reasons
for which decisions were made. It is undoubtedly also true, however,
that  it  is  easier  to  determine  the  outcome  of  a decision  than  to 
determine  whether  the  decision  was  logically  justified  at  the  time 
that  it  was  taken.  One  wonders  how  many  heroes  have  been  made,  not 
in  spite  of,  but  because  of,  very  poor  decisions  which  have  had 
happy  outcomes,  and,  conversely,  how  many  "bumblers"  owe  their 
reputations  not  to  the  illogicality  of  critical  decisions  they 
have  made,  but  to  fortuitous  turns  of  events  that  have  blessed 
sound  choices  with  disastrous  results. 

We  may  note  in  passing  that  even  if  one  wishes  to  evaluate  a 
decision  in  terms  of  its  effectiveness,  rather  than  its  logical 
soundness, the task may be less than straightforward. Miller and
Starr (1969) make the point that decision objectives are not always
singular. Often, one is attempting to realize several objectives
simultaneously, and seldom is it possible to optimize with respect
to all objectives at the same time. It is difficult in such cases
to  evaluate  a decision  outcome  unless  its  implications  with  respect 
to  all  the  objectives  can  be  combined  into  a single  figure  of  merit. 

One  attempt  to  develop  a procedure  for  combining  performance 
scores  on  various  decision-effectiveness  criteria  into  a single 
figure of merit was made by Sidorsky (1972), and Sidorsky and his
colleagues (1968, 1970). A set of operational criteria that were
intended to be used to evaluate the decision performance of a
military tactical unit was identified as follows: spatial
relationships (the spatial interface between own and enemy tactical
units), self-concealment (the degree of success in keeping the
enemy uninformed concerning own unit), information generation
(the degree of success in keeping informed concerning enemy
unit), weapon utilization (destroy or counterattack capability),
and conservation of resources (adequacy of supplies). Such
criteria have been used by Sidorsky to rate the quality of
decisions made during experimental tactical exercises. A
Decision Response Evaluation Matrix was developed which, when
used in conjunction with an algorithm for combining scores with
respect to all five operational criteria, permitted the quality
of  a decision  to  be  expressed  as  a single  measure. 
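
The report does not describe the combining algorithm itself. Purely as an illustration of how scores on several criteria can be collapsed into one figure of merit, the sketch below uses a weighted average. The weighting scheme, the 0-to-1 score scale, and all names are assumptions and should not be read as Sidorsky's procedure.

```python
# Criterion names follow the five operational criteria listed in the text.
CRITERIA = (
    "spatial_relationships",
    "self_concealment",
    "information_generation",
    "weapon_utilization",
    "conservation_of_resources",
)


def figure_of_merit(scores, weights=None):
    """Weighted average of criterion scores (assumed to lie in 0..1)."""
    if weights is None:
        weights = {name: 1.0 for name in CRITERIA}  # equal weights by default
    total_weight = sum(weights[name] for name in CRITERIA)
    weighted_sum = sum(weights[name] * scores[name] for name in CRITERIA)
    return weighted_sum / total_weight


# Hypothetical ratings for a single decision.
example_scores = {
    "spatial_relationships": 0.8,
    "self_concealment": 0.6,
    "information_generation": 0.4,
    "weapon_utilization": 1.0,
    "conservation_of_resources": 0.2,
}
merit = figure_of_merit(example_scores)
print(round(merit, 3))
```

A weighted average is only one of many possible combining rules; it presupposes that the criteria can be traded off against one another, which is itself a substantive assumption.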

11.2  Evaluation  Criteria 

Granted  that  logical  soundness  is  the  appropriate  basis  on 
which  to  evaluate  decisions,  the  problem  then  is  to  translate 
that principle into a set of objective criteria against which
decision-making  performance  can  be  judged.  In  view  of  the  huge 
literature  on  decision  making,  surprisingly  little  attention 
has  been  given  to  this  problem. 

Sidorsky and his colleagues (1964, 1966, 1968, 1970) and
Hammell and Mara (1970) have suggested five behavioral factors in
terms of which an individual decision-maker's performance might
be judged: stereotypy (the tendency of a decision maker to respond
in an unnecessarily predictable way), perseveration (the tendency
to persist when persistence is unwarranted), timeliness (the
extent to which the decision-maker's behavior is reasonable in
terms of the time constraints imposed by the situation),
completeness (the extent to which all available relevant
information is used), and series consistency (the consistency of the
decision-maker's behavior within the context of a series of
interrelated actions). The first two factors are liabilities
for a decision maker; the last three are assets. In contrast
with  the  operational  criteria  mentioned  in  the  preceding  section, 
these  behavioral  criteria  are  more  concerned  with  the  logicality 
of  a decision  than  with  its  effectiveness. 

The conceptualization of the decision-making process that has
provided the structure of this report suggests a number of dimensions
with respect to which the quality of a decision-making activity
might be evaluated: the adequacy of the information-gathering
process; the sensitivity of data evaluation; the appropriateness
of the structure that is given to a decision problem; the facility
with which plausible hypotheses are generated; the optimality of
hypothesis evaluation; the sufficiency with which preferences are
specified; the completeness of the set of decision alternatives
that is considered; and the timeliness of action selection and its
consistency with the decision maker's preferences, objectives, and
information in hand. The development of techniques for assessing
these aspects of decision making quantitatively and unambiguously
represents a challenge to investigators of decision-making behavior.

11.3 A Methodological Problem

It is worth noting that to determine after a decision has been
made whether its basis was logically sound may be a very difficult
task. People usually can give plausible reasons for choices they
have made. One may be permitted a certain amount of skepticism,
however, concerning whether reasons that are given after the fact
are the reasons that prevailed at the time of the making of the
choice (Soelberg, 1967). This is not to suggest that people
necessarily misrepresent the bases for their decisions intentionally.
It seems not unlikely, however, that we frequently convince
ourselves, without being conscious of doing so, that choices have
been  determined  by  certain  rational  considerations,  when  in  fact 
those  considerations  were  discovered  or  invented  only  after  the 
choice  was  made.  One  might  argue  that  even  though  the  alleged 
basis  of  a decision  may  not  have  been  verbalized,  or  even  consciously 
appreciated  by  the  decision  maker,  it  could  still  have  been  opera- 
tive at  a subconscious  level  at  decision  time.  But  this  is  a 
difficult,  if  not  impossible,  point  to  confirm  or  invalidate  ex- 
perimentally, and  for  that  reason  it  is  not  a very  useful  hypothesis. 
Pascal (1910) expressed his skepticism concerning the credibility
of after-the-fact introspective explanations of behavior over three
hundred years ago: "M. de Roannez said: 'Reasons come to me
afterwards, but at first a thing pleases or shocks me without my
knowing the reason, and yet it shocks me for the reason which I only
discover afterwards.' But I believe, not that it shocked him for
the reasons which were found afterwards, but that these reasons
were only found because it shocks him" (p. 98).

SECTION XII

SOME FURTHER COMMENTS ON TRAINING OF DECISION MAKERS

Throughout  this  report  we  have  commented  on  how  the  theoret- 
ical notions  and  research  findings  that  have  been  reviewed  relate 
to  issues  of  training  and  training  research.  These  comments  have 
been  made  within  the  contexts  of  the  discussions  to  which  they 
pertain. It is not our purpose in this section to review or
summarize these comments, but rather to turn to some training-related
topics  that  have  not  been  addressed  elsewhere  in  the  report. 

12.1  Performance  Deficiencies  versus  Performance  Limitations 

Some investigators (Hammell & Mara, 1970) have advocated
the approach of identifying "behavioral deficiencies" and
developing training programs that are designed to ameliorate
them. Similarly, Kanarick (1969) has suggested that one component
of  a training  program  for  decision  makers  should  be  that  of  making 
them  aware  of  some  of  the  common  reasons  for  the  making  of  poor 
decisions. 

The  term  "deficiencies"  has  been  used  in  two  ways  in  the 
literature: to refer to stereotyped ways of behaving suboptimally,
and to refer to basic human limitations. In what follows, we will
refer to the second type of "deficiencies" as limitations, and
use  the  word  deficiency  only  to  denote  suboptimal  but  presumably 
correctable  behaviors.  An  example  of  a behavioral  deficiency 
would  be  the  tendency  of  humans  to  be  overly  conservative  in  their 
application  of  probabilistic  information  to  the  evaluation  of  hy- 
potheses. A possible  example  of  a limitation  would  be  the  in- 
ability of  most  people  to  weigh  more  than  some  small  number  of 
factors,  without  some  procedural  help,  in  arriving  at  a preference 
among  choice  alternatives. 

The  distinction  between  deficiencies  and  limitations  has 
important implications for training. Deficiencies may be "trained
out";  basic  limitations  must  be  "trained  around." 

The  first  problem  in  dealing  with  either  a putative  deficiency 
or  a limitation,  however,  is  to  verify  that  it  indeed  exists.  It 
is  obviously  imperative,  when  a deficiency  or  limitation  is  iden- 
tified by  a single  experimental  study,  that  the  finding  be  cor- 
roborated by  further  research.  More  important,  however,  and  more 
difficult,  is  the  problem  of  establishing  that  the  conclusions 
drawn  from  experimental  studies  are  valid  beyond  the  laboratory 
environments  in  which  the  results  were  obtained.  It  is  exceed- 
ingly difficult  to  capture  some  of  the  aspects  of  many  real-world 
decision  problems  (e.g.,  very  high  stakes)  in  laboratory  situa- 
tions. And  what  may  constitute  appropriate  behavior  in  the  one 
situation may prove to be inappropriate in the other.

Assuming, however, that one is able to identify some examples
of deficient behavior that appear to be fairly universal among
decision makers, the question is how to go about training them out.
One obvious possibility is to expose trainees to decision-making
situations in which a given deficiency is likely to show itself
if it is ever going to do so, and then provide the individual with
some  immediate  feedback  concerning  the  appropriateness  of  his  be- 
havior. One  would  probably  want  to  provide  numerous  opportunities 
for  the  same  deficiency  to  show  itself  in  a variety  of  contexts, 
providing  feedback  to  the  trainee  each  time  that  the  deficiency  is 
displayed.  Probably,  too,  feedback  should  be  provided  for  some 
time  after  performance  has  improved  to  the  point  that  the  deficiency 
is  no  longer  apparent. 

When dealing with basic human limitations, the goal should be
to educate the decision maker concerning what those limitations
are and to provide him with the means for working around them.
For example, if it is the case that without the help of some
explicit procedure, a decision maker cannot effectively weigh more
than n variables in attempting to optimize his choice of an action
alternative,  it  may  be  futile  to  try  to  train  him  to  make  effective 
use  of  more  variables;  however,  if  that  is  the  case,  he  should  be 
made  aware  of  his  limitation  and  be  trained  to  perform  within  it. 

Another approach to dealing with deficiencies and limitations--
in addition to training--is that of providing the decision maker
with aids to facilitate various aspects of the decision process.
Training and decision aiding are not viewed by the
writers as mutually exclusive, but rather as complementary,
approaches to the improvement of decision making. Moreover, the
fact that decision aids are being developed has implications for
training, a point to which we will return in Section XIII.

12.2 Simulation as an Approach to Training

A common approach to the problem of training decision makers
is that of simulation (Bellman, Clark, Malcolm, Craft, &
Ricciardi, 1957; Cohen & Rhenman, 1961). The idea is to place
decision makers in contrived situations that are similar
in certain critical respects to the decision-making situations
that they are likely to encounter in the real world. The approach
has been used in efforts to train business executives (Martin,
1959), prospective high-school principals (Alexander, 1967),
research and development project managers (Dillman & Cook, 1969),
military strategists and tacticians (Carr, Pyrwes, Bursky, Linzen,
& Hull, 1970; Paxson, 1963), high-school history and science
teachers (Abt, 1970), vocational-education leaders (Rice & Meckley,
1970), and government planners (Abt, 1970).

Most business colleges and graduate schools today make some
use of simulation and gaming techniques to teach management and
decision-making skills. Also, as a result of early efforts by the
American Management Association to develop a decision-making
course, corporations such as General Electric, Pillsbury,
Westinghouse, and Standard Oil of New Jersey have devised in-house training
programs that make use of simulation techniques.

Two different forms of management-training games are discussed
by Cohen and Rhenman (1961) in their survey of the present and
future roles of such games in education and research. The first
form--the "general-management" game--attempts to provide experience
in the making of business decisions at a top-executive level, while
the second form--the "functional" business game--focuses on specific
decision situations within a limited functional level of the
organization. Because of the complexity of interactions among
organizational entities and the multidimensionality of the decision
environment simulated in the general-management games, the possibility
of defining and utilizing optimal strategies has not yet been
demonstrated. The functional game situations, on the other hand,
which are typically lower in complexity, allow for the specification
and  application  of  optimal  or  "best"  strategies. 

A variety of views have been expressed concerning the strengths
and weaknesses of simulation as an approach to training. Kibbee
(1959) suggests the following advantages:

"1) It (simulation) can provide a dynamic opportunity for
learning such management skills as organization, planning,
control, appraisal, and communication.

2) Simulation can provide an executive with an appreciation
of overall company operations and the interaction between man,
money and materials. It helps make a generalist out of a
specialist who has never had the opportunity of reviewing his
decisions as they affect the organization as a whole.

3) Simulation can provide executives with practice, insight
and improvement of their main function: making decisions.
Faced with realistic decisions about typical business problems,
they can experience years of business activity in a matter of
hours, in an environment similar to that they face in everyday
life.

4) Simulation can exhibit what Dr. Forrester of M.I.T. calls
the 'dynamic, ever-changing forces which shape the destiny of
a company.' The general business principles that are
illustrated can be studied and understood by the participants"
(p. 8).

Similar themes are expressed by Abt (1970) concerning the efficacy
of management games:

"Games are effective teaching and training devices for students
of all ages and in many situations because they are
highly motivating, and because they communicate very efficiently
the concepts and facts of many subjects. They create
dramatic representatives of the real problem being studied.
The players assume realistic roles, face problems, formulate
strategies, make decisions, and get fast feedback on the
consequences of their action.

In short, serious games offer us a rich field for a risk-free,
active exploration of serious intellectual and social problems"
(p. 13).

Simulation as a general approach to the training of decision
makers is not without its critics, however. Martin (1959), who
generally endorses the approach, volunteers several caveats. He
points out, for example, that many of the qualitative dimensions
of a situation, such as personnel quality and morale in an
organization being modelled, are difficult to reflect in a game. Further,
in order to make a game administratively manageable, it may be
necessary to limit the degrees of freedom one has with respect to
innovation, which is an unfortunate constraint. Finally, he points
out that it is not always clear exactly what students are learning
in a simulation situation. "There is no doubt that the simulation
technique is a powerful teaching device, and therefore is
potentially dangerous unless we are relatively sure of what is being
taught."

One wonders, in connection with the last point, if definition
of what should be taught and learned can really be expected prior
to development of an adequate prescriptive theory of management
decision making. Moreover, it seems clear that so long as decisions
are evaluated in terms of effectiveness rather than in terms of
logical soundness, the answer to the question of whether any
training program is teaching individuals to make optimal decisions will
remain a matter of conjecture. Apropos the point of how to insure
that simulations have some realism, Freedy, May, Weisbrod, and
Weltman (1974) have proposed a technique for generating
decision-task scenarios that utilize expert judgments concerning state
variables and transformations in much the same way that a Bayesian
aggregator would make use of expert judgments of conditional
probabilities.


We would summarize our own attitude toward simulation training
in the following way. The approach has many advantages. The
student can be exposed to a variety of decision situations.
Situation parameters can be varied systematically, thus permitting the
study of their effects on decision-making performance. The
consequences of incorrect decisions are not catastrophic, as they could
be in some real-life situations of interest. The student's
performance can be evaluated and immediate feedback can be provided
to him, thus, presumably, improving his chances of learning.

On  the  negative  side  of  the  ledger,  there  is  first  the  diffi- 
culty of  the  task  of  deciding  what  aspects  of  a situation  to  simulate. 
Any  simulation  is  a simplification,  and  if  one  wishes  to  assure 
transfer  of  what  is  learned  in  the  simulated  situation  to  real-life 
situations,  it  is  imperative  that  the  simulation  preserve  those 
aspects  of  the  real-life  situation  that  are  relevant  to  the  skill 
that  is  being  trained.  Moreover,  the  difficulty  of  assuring  the 
veridicality  of  a simulation  is  likely  to  increase  greatly  with 
the  complexity  of  the  situation  that  is  being  simulated.  Second, 
there  is  the  problem  of  generality.  Situations  are  specific.  One 
wants  the  student  to  carry  away  from  training  sessions  skills  which 
will be applicable in a variety of contexts. Simulation itself
does not guarantee that that will occur. In fact, one might guess
that  there  would  be  the  danger  of  focusing  on  specific  aspects  of 
particular  situations  which  could  have  a tendency  to  impair  the 
learning  of  general  principles. 

12.3 On the Idea of a General-Purpose Training System for
Decision Makers


A training  system  for  decision  makers  that  has  a reasonable 
degree  of  generality  is  bound  to  be  a relatively  complex  system. 
Moreover, given the current level of understanding of decision
processes, it is unlikely that anyone would be able to design a
system that would be certain to be satisfactory. The approach
that seems to us most likely to produce a useful system is an
explicitly evolutionary one, and one that involves potential users
of the system in its development from the earliest stages. What
one needs to do is build a working system that represents one's
best guess concerning what capabilities such a system should have,
and then elaborate, extend, and improve the system in accordance
with  the  insights  that  are  gained  through  attempts  to  make  use  of  it. 

The idea that many complex systems are best developed through
an evolutionary process is not a new one. Bonington (1964) has
argued strongly for such an approach in the development of
command-and-control systems. Commenting on the fact that many systems
become obsolete even before they are operational, he notes that "The
principal cause of this situation is the fact that until recently
the proposed users of these systems did not take many interim steps
that would have helped them; instead, they waited for the grand
solution. When the development of these command-and-control systems
was undertaken, it was thought that the design team could analyze
present operations, project changes over many years, design a system
for the far-off future, and then implement. Now most agree that
this process just won't work" (p. 16).

SECTION XIII

DECISION AIDS

The recognition that--whether because of behavioral deficiencies
or basic limitations--men often do not perform optimally
as decision makers has motivated the development of numerous
decision-aiding procedures and techniques. The existence of decision
aids has two somewhat opposing implications for the training
of decision makers: On the one hand, insofar as an aid succeeds
in simplifying or otherwise facilitating the performance of some
specific task, its existence may lessen the training demands
vis-a-vis that task; on the other hand, users of decision aids must
be  trained  to  use  those  aids.  It  does  not  follow  from  the  fact 
that  some  training  may  be  required  before  an  aid  can  be  used 
effectively  that  the  aid  is  therefore  a failure;  if  a trained 
user  of  an  aid  can  make  better  decisions  than  a trained  decision 
maker  who  does  not  use  that  aid,  then  the  aid  may  be  said  to  be 
an  effective  one. 

Given  the  view  of  decision  making  as  comprised  of  a variety 
of  tasks  and  processes,  it  seems  reasonable  to  expect  that  initial 
decision-aiding  techniques  will  be  more  successfully  applied  to 
some  of  these  tasks  than  to  others.  The  goal  should  be,  not  to 
develop  the  grand  aid  for  the  decision  maker,  but,  rather,  to 
develop  a variety  of  aids  to  facilitate  performance  of  the  various 
tasks. Together, a group of such aids might be thought of as a
"decision support system" (Levit, Alden, Erickson, & Heaton, 1974;
Meadow  & Ness,  1973;  Morton,  1973),  but  the  individual  aids,  and 
not  the  system,  are  probably  the  more  reasonable  objectives  toward 
which  to  work  initially. 

Another factor that some researchers have argued is highly
relevant to the design of decision aids is that of individual
differences. One group of investigators, for example, has
characterized "decision styles" in terms of three dimensions with respect
to which individuals are assumed to vary: abstract-concrete,
logical-intuitive, and active-passive (Henke, Alden, & Levit, 1972;
Levit, Alden, Erickson, & Heaton, 1974). All possible combinations
of the extremes of these dimensions are viewed as eight "pure
decision styles" that are representative of the types of
individualized approaches to decision making that decision-aiding systems
must take into account. The point that these investigators make
is that decision aids, or decision support complexes, should be
designed with particular users, or user types, in mind. Systems
designed for one type of decision style, they claim, may degrade
the performance of a user who operates according to a different
style.
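
The count of eight follows from taking one extreme on each of three two-valued dimensions (2 x 2 x 2). A brief sketch makes the enumeration explicit; the pole labels are taken from the text, while the variable names are illustrative.

```python
from itertools import product

# The three dimensions named in the text, each with its two extremes.
DIMENSIONS = (
    ("abstract", "concrete"),
    ("logical", "intuitive"),
    ("active", "passive"),
)

# Every combination of one extreme per dimension is one "pure" style.
pure_styles = list(product(*DIMENSIONS))

for style in pure_styles:
    print("-".join(style))
```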

Decision aids run the gamut from the types of heuristic
principles discussed by Polya (1957) to explicit paper-and-pencil
procedures  for  working  through  some  aspect  of  a decision  problem, 
to  interactive  computer-based  techniques.  In  this  section,  we 
consider  only  a few  of  the  many  aids  to  decision  making  that  have 
been  developed.  The  intent  is  not  to  provide  an  exhaustive  review 
but  a representative  sampling  of  what  has  been  done  in  this  regard. 


13.1  Linear  Programming 

Linear programming is a mathematical technique for determining
a set of decision parameter values that maximizes or minimizes
a specified linear function within certain linear constraints. The
technique is particularly useful in solving such problems as resource
allocation,  production  mix  and  industrial  cost  control.  It  is 
best  illustrated  by  a simple  example. 

Suppose a manufacturer produces three products. We will
designate the monthly quantities of these products as x1, x2, and
x3. The products have different unit production costs, say,
a1, a2, and a3, and different unit sale prices, say, b1, b2, and
b3. To keep the illustration simple, we ignore the problem of
inventories. Raw material limitations restrict the number of units
of products 1 and 3 that can be produced per month to c1 and c2,
respectively. The total number of man-hours available to the
producer is n per month, and it requires d1, d2, and d3 man-hours
to produce one unit of products 1, 2, and 3, respectively. The
problem is to determine the number of units of each product that
the manufacturer should produce per month in order to maximize
his profit.

Linear programming is a technique for solving such problems,
when solutions exist. The technique involves expressing the
constraints as a set of simultaneous linear equations and
inequalities, and then searching within the ranges of the values of
the independent variables that satisfy them for those values that
optimize the desired function. In the case of our example, the
function to be optimized (in this case, maximized) would be the
profit function, i.e.,

(b1 - a1)x1 + (b2 - a2)x2 + (b3 - a3)x3,

subject to the constraints x1 <= c1, x3 <= c2,
d1x1 + d2x2 + d3x3 <= n, and x1, x2, x3 >= 0.


When the problem involves only two or three decision variables,
a geometrical model of the situation can give the decision
maker an intuitively meaningful representation of the significance
of the various factors and, in particular, of the sensitivity of
the decision outcome to a less than optimal selection of values
for the decision variables. When certain boundary conditions are
met, the set of parameter values that satisfies the linear
constraints within which the decision must be made is represented
by convex polygons or polyhedra (in the two- and three-variable
cases, respectively), and the solution to the optimization problem
invariably is (or at least contains) one of the figure's vertices.
The same principle holds in cases of more than three variables,
but, of course, the geometrical model is no longer helpful.
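The vertex property suggests a brute-force check for small problems: intersect the constraint boundaries pairwise, keep the intersections that satisfy every constraint, and evaluate the objective at each. The sketch below does this for a hypothetical two-variable case. The constraint values and the names vertices and best_vertex are assumptions made for illustration, and the method is plain enumeration, not the simplex procedure normally used in practice.

```python
from fractions import Fraction
from itertools import combinations


def vertices(constraints):
    """Corner points of the region p*x + q*y <= r for each (p, q, r)."""
    points = set()
    for (p1, q1, r1), (p2, q2, r2) in combinations(constraints, 2):
        det = p1 * q2 - p2 * q1
        if det == 0:                 # parallel boundaries never intersect
            continue
        x = Fraction(r1 * q2 - r2 * q1, det)   # Cramer's rule
        y = Fraction(p1 * r2 - p2 * r1, det)
        if all(p * x + q * y <= r for p, q, r in constraints):
            points.add((x, y))
    return points


def best_vertex(constraints, gain):
    """Vertex maximizing gain[0]*x + gain[1]*y, with the value attained."""
    def value(v):
        return gain[0] * v[0] + gain[1] * v[1]
    top = max(vertices(constraints), key=value)
    return top, value(top)


# Hypothetical data: x >= 0, y >= 0, x <= 30, y <= 50, 2x + y <= 80.
region = [(-1, 0, 0), (0, -1, 0), (1, 0, 30), (0, 1, 50), (2, 1, 80)]
print(best_vertex(region, (5, 4)))   # unit profits of 5 and 4 are assumed
```

Enumeration of this kind grows rapidly with the number of variables and constraints, which is one reason the geometrical picture stops being useful beyond three variables.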

One  of  the  limitations  of  linear  programming  is  the  fact 
that  it  is  applicable  only  to  situations  in  which  the  decision 
space  has  been  fully  represented  numerically  and  the  outcomes  of 
all  of  the  admissible  decisions  are  known.  Another  is  the  fact 
that it can be used only when the effects of the individual
decision variables combine in an additive (linear) fashion. One can
imagine real-life decision situations in which the effect of a
change in the value of one decision variable depends in some way
on the value of another variable. For example, how much
importance one would attach to a difference in salary between two jobs
might depend on whether the jobs also differed significantly in
terms of the extent to which they placed one's life in danger.

As has already been noted in Section IX of this report, however,
several investigators of decision making have argued that the
assumption of additivity appears to be a reasonable one in many,
if not most, real-life situations. Probably the more difficult
requirement to satisfy in order to use linear programming
techniques is that of adequately structuring the decision space and
quantifying the salient variables. When the necessary conditions
can be met, however, there can be no doubt of the effectiveness
of the technique.


13.2 Decision Trees and Flow Diagrams


Sometimes it is possible to convert an apparently complex
set of written or verbal instructions concerning a problem-solving
procedure into a decision tree or flow diagram. When such a
conversion can be accomplished, it is often found that the desired
procedure is more easily and efficiently followed with the aid of
the diagram than with the original set of instructions (Blaiwes,
1971; Raiffa, 1968; Wason, 1968; Wright, 1971).

The following distinction between decision trees and flow
diagrams is made by Triggs (1973): "A decision tree is an assembly
of individual paths in a structure organized so that no path ever
returns or proceeds to another part of the diagram. A decision
flow diagram may, on the other hand, contain paths that return
to early parts of the diagram or feed to other common elements.
A decision flow diagram can be more operationally directive in its
structure, and less concerned with the explicit details of the
decision process. In a tree structure, at every node of the tree,
the user of the diagram can exactly state by what set of chance
events and decisions one arrived there. The flow diagram structure
is not always organized so that each such path can be uniquely
specified" (p. 3).
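The distinction can be stated as a simple test on a diagram's structure: a diagram is a tree in this sense only if no node can be reached by more than one route. The sketch below is one way to express that test. The node labels and the function name is_decision_tree are hypothetical and are not taken from Triggs.

```python
def is_decision_tree(successors, root):
    """True if every node reachable from root is reached by exactly one path.

    successors maps a node to the list of nodes its branches lead to.
    """
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in seen:          # a second route (a merge or a loop) ends here
            return False
        seen.add(node)
        stack.extend(successors.get(node, []))
    return True


# Hypothetical troubleshooting procedure drawn as a tree.
tree = {
    "start": ["test passes", "test fails"],
    "test fails": ["replace part", "call supervisor"],
}

# The same kind of procedure drawn as a flow diagram: "adjust" returns
# to "check", so the history of arrival at "check" is not unique.
flow = {
    "start": ["check"],
    "check": ["done", "adjust"],
    "adjust": ["check"],
}

print(is_decision_tree(tree, "start"), is_decision_tree(flow, "start"))
```

When the test succeeds, the sequence of events and decisions leading to any node can be recovered uniquely, which is the property Triggs attributes to the tree form.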

The clarity and efficiency gained by representing procedures
requiring sequential decisions in diagrammatic form have been
recognized for some time. In such fields as computer programming
and systems analysis, graphic techniques have been employed in
the teaching and conduct of specific programming, debugging,
maintenance, and troubleshooting tasks. Only recently, however, have
formal attempts been made to assess the benefits to be derived. In
an entertaining article by Davies (1970) the results of a relevant
experiment by B. N. Lewis are discussed. The latter investigator
presented a series of six problems involving a tax regulation to
each of 60 subjects. One third of the subjects worked with the
original (prose) statement of the regulation, a second third worked
with a simplified (prose) statement, and the final third worked
with an algorithmic (decision tree) form. The mean time required
by the original prose group to solve all six problems was 23.4
minutes, compared to 11.8 minutes required by the simplified prose
group and 2 minutes required by the algorithm group. Mean errors
in problem solution followed a similar pattern: 29%, 10%, and 8%
for the respective groups.

More recently, Blaiwes (1973) compared the performance of
decision makers who had been given instructions concerning the
construction and use of decision trees with that of decision
makers who had not been so instructed. Only one of the ten subjects
in the uninstructed group gave evidence of using a decision-tree
approach to the solution of the four experimental tasks, whereas
all ten of the instructed subjects used it. Subjects using the
decision-tree approach initially required more time than
uninstructed subjects, but their performance improved as they
gained facility with the approach. Most importantly, subjects
in the instructed group performed at a higher level of
accuracy than subjects in the uninstructed group. Although the
possible effects due to practice cannot be separated from those
due to problem difficulty because of the particular design used
by Blaiwes, we regard the experiment as a demonstration of the
ease with which the decision-tree approach can be taught to
individuals who have not previously encountered it.

A review of numerous attempts to apply decision trees and
flow diagrams to the solution of decision problems (e.g., Baker,
1967; Clarkson, 1963; Dutton & Starbuck, 1971; Horabin, 1972;
Howard, Matheson, & North, 1972; Rousseau & Zamora, 1972;
Tuddenham, 1968) has been prepared by Triggs (1973). He points out
that the degree to which such aids can be useful to a decision
maker will depend on the nature of the problem that is faced.
They tend to be most useful for situations that are easily
structured, perhaps by means of decomposition techniques advocated
by Raiffa (1968). Triggs cautions against the temptation "to make
a complex problem tractable by forcing it into a conceptual
representation with which one knows how to cope," at the expense of
ignoring or eliminating critical aspects of the real problem.
He also points out that the task of imposing the type of structure
on a decision problem that is necessary if decision trees or flow
diagrams are to be used to advantage may be sufficiently
time-consuming and expensive to assure its impracticality in some
dynamic situations in which the time for analysis is limited.
Moreover, forcing the decision maker to think about his problem in
terms of a specific structure may inhibit his use of cognitive
skills that he otherwise might bring to the task. Triggs concludes,
however, that on balance these cautions do not negate the efficacy
of the approach. Citing Zadeh's (1973) work, he notes that, "even
in systems that are too complex or too ill-defined to admit of
precise quantitative analysis, 'fuzzy' algorithms and diagrams
have the potential of being useful to the human decision maker"
(p. 17).

A lucid  tutorial  treatment  of  decision  trees  and  their  use 
is presented by Peterson, Kelly, Barclay, Hazard, and Brown (1973)
in Chapters 2 and 3 of a Handbook for Decision Analysis. The
handbook  has  been  prepared  for  the  express  purpose  of  aiding  the 
individual  who  is  faced  with  substantive  decision  problems  to 
apply  concepts  and  procedures  of  decision  theory  to  the  solution 
of  those  problems. 

13.3 Delphi, an Aid to Group Decision Making

The decision maker of most prescriptive models of decision
making could be an individual, a committee, a corporation, or a
machine, inasmuch as such models are concerned with the
decision-making process and are indifferent to the nature of its embodiment.
Most empirical studies of decision making, however, have focused
on the behavior of individuals. Relatively little attention has
been given to the question of how decisions are, or should be,
made by n-person groups. There are, of course, large
literatures dealing with related topics such as the effects of group
organization and communication channels on problem solving, and
the effects of group pressures on individual behavior.

One  generalization  that  it  seems  safe  to  make  is  that  the 
decision-making  performance  of  groups  may  be  influenced  by  a 
number  of  factors  that  are  not  obviously  related  to  decision 
quality  in  any  straightforward  way.  Especially  is  this  true  when 
group  members  are  required  to  resolve  problems  about  which  there 
exist  conflicting  views.  As  Helmer  (1967)  puts  it: 

"Round-table  discussions  for  such  purposes  have  certain 
psychological drawbacks in that the outcome is apt to be a
compromise between divergent views, arrived at all too often
under the undue influence of certain factors inherent in the
face-to-face situation. These may include such things as
the  purely  specious  persuasion  of  others  by  the  member  with 
the  greatest  supposed  authority  or  even  merely  the  loudest 
voice,  an  unwillingness  to  abandon  publicly  expressed 
opinions,  and  the  bandwagon  effect  of  majority  opinion"  (p.  9). 

As one means of remedying these types of problems, and of
providing a rationale by which to combine "expert" opinions, the
Delphi method was created (Brown, 1968; Dalkey & Helmer, 1963;
Helmer, 1967; Rescher, 1969). This technique requires each member
of the group to write down his independent assessment of the
problem or solution under study. The set of assessments is then
revealed to all members but without identification of which
particular assessment was made by which member. The pros and cons of
each response are then openly debated and each member files a
second assessment. Following n repetitions of this procedure,
the median assessment is then adopted.
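The iterative procedure just described can be sketched as a loop. In the sketch below, the rule by which a member revises his assessment is a pure assumption (each member moves halfway toward the group median), since in practice revision is a matter of individual judgment after debate. The names delphi and move_halfway are likewise introduced only for illustration.

```python
from statistics import median


def delphi(initial, revise, rounds):
    """Run anonymous feedback rounds; return the median and final assessments.

    initial: one numeric assessment per group member.
    revise:  hypothetical rule (own_value, shared_values) -> new value,
             standing in for a member's reconsideration after feedback.
    """
    assessments = list(initial)
    for _ in range(rounds):
        shared = sorted(assessments)      # sorting hides who said what
        assessments = [revise(own, shared) for own in assessments]
    return median(assessments), assessments


def move_halfway(own, shared):
    """Illustrative revision rule: move halfway toward the group median."""
    return own + 0.5 * (median(shared) - own)


adopted, final = delphi([10, 20, 40, 100], move_halfway, rounds=3)
print(adopted, final)
```

Under this particular rule the median itself does not move and only the spread of opinion narrows. With real participants both may change from round to round.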

The  Delphi  procedure  is  reputed  to  be  usable: 

"1)  To  determine  what  the  operative  values  of  a group  are, 
what relative weight they have, what sorts of possible
trade-offs obtain among them, and the like.

2) To explore the sphere of value criteriology, clarifying
by what criteria the values of a group come to be brought to
bear upon actual cases.

3) To discover divergences of value posture within a group
and the existence of subgroups with aberrant value structures.

4) To serve as a tool for seeking out areas of value
consensus--or agreement as to actions and preferences--
that may exist even when there are conflicts of value.

5) To provide a tool for the third-party evaluation of
conflicts of interest.

6) To assess the correctness of value ascriptions to given
groups.

7) To assess the correctness of value judgments in the area
of means-values" (Rescher, 1969, p. 17).

The  use  of  a modified  version  of  the  Delphi  technique  is 
illustrated  in  a recent  effort  by  O'Connor  (1972)  to  apply  expert 
judgment  to  the  scaling  of  water  quality.  The  problem  was  to 
assess  the  quality  of  water  to  be  used  (1)  as  a public  supply, 
and (2) for the maintenance of a fish and wildlife population.
Eight experts made iterative judgments as to the parameters to be
included,  the  relative  importance  weights  to  be  assigned,  and  the 
rules  for  combination  of  indices.  Good  consensus  was  obtained 
with  respect  to  sets  of  judgment  parameters  and  combination  rules, 
but  there  was  considerable  disagreement  on  weightings.  O'Connor 
found,  however,  that  this  disagreement  was  not  critical  in  the 
development  of  the  final  indices. 

An  important  feature  of  the  Delphi  technique  is  the  fact  that 
it  provides  a means  for  achieving  group  consensus  without  the  need 
for the face-to-face discussion of issues which typifies most
group  problem-solving  methods.  This  characteristic  was  exploited 
in  the  O'Connor  study,  where  the  experts  were  geographically  widely 
separated  and  were  never  in  direct  communication  with  each  other. 

13.4 Computer-Based Decision Aids

The potential advantages to be gained from applying the
general computational capabilities of digital computers to
decision problems have been recognized for some time. Several writers
have made very convincing arguments to the effect that both men
and computers have something to offer to the decision-making
process, and that the need is for the development of decision
systems that assure a symbiotic coupling of the capabilities of
man and machine (Briggs & Schum, 1965; Edwards, 1965b; Licklider,
1961; Shuford, 1965; Yntema & Klem, 1965; Yntema & Torgerson, 1961).

It is not difficult to imagine a computer system being used
to aid a decision maker in the performance of essentially all of
the aspects of decision making that we have considered in
foregoing sections of this report. Such a system might provide the
decision maker with a data base of facts or observations that are
relevant to his decision problem. It could serve as an extension
of his own memory by keeping a record of factors that he had
indicated he ought to "keep in mind" in making a decision. It could
help him generate hypotheses, and to structure and present the
decision space. It could help him discover what his preferences
are and to express them in a quantitative way. It could provide
graphical representations of the decision situation. It might
(assuming a valid model of the decision problem) project the
probable consequences of various action selections. It might serve
as an interface between two or more decision makers collaborating
on the same problem and facilitate the application of group
decision techniques. It could do whatever computation was required.
It could prod the decision maker to consider aspects of the problem
that he otherwise might overlook. It could suggest approaches or
strategies that have been found to be useful in similar problem
situations. It could make explicit to the decision maker (either
by inference or by questioning of the decision maker himself) some
aspects of the situation or the decision maker's thinking that
otherwise would only be implicit. And so on.

It  is  in  fact  so  easy  to  imagine  ways  in  which  the  computer 
could  be  used  as  an  aid  for  decision  making  that  one  can  be  seduced 
to  thinking  that  the  implementation  of  such  capabilities  is  a 
straightforward  thing.  In  some  instances  this  is  perhaps  the  case; 
in  others,  it  assuredly  is  not.  The  important  point  is,  however, 
that  computer-based  decision  aids  are  being  developed  and  quite 
sophisticated  ones  are  likely  to  be  operational  in  the  near  future. 
No training program for decision makers can afford to ignore this
fact.


In a preceding section of this report some comments were made
concerning simulation as an approach to training. Given the
availability of computer systems to decision makers, another way that
simulation may be used to advantage is as an operational decision
aid. In this case the effects, or probable effects, of selecting
specific action alternatives can be explored by the decision maker
before he actually makes his choice (Ferguson & Jones, 1969). The
projections or predictions of the aid will only be as good, of
course, as is the model of the situation that produces them, and
it is not necessarily the case that the use of such predictive
aids will invariably lead to improved performance (Sidorsky & Mara,
1968). The potential for this type of simulation is great, however,
and deserves more attention than it has received to date. At the
very least, such an aid can be used to help determine what is
possible and what is not, giving an accurate representation of the
current state of affairs. The point is illustrated by an
experimental decision aid designed to monitor and control maritime
traffic (Elmaleh, Prywes, & Gustaferro, 1967). The system was
composed of a formatted data base, a set of "worker programs" which
operated on the data base, and a query language which allowed the
user to interact with the data base on line. Information that
could be extracted from the data base on request included "(1)
past, present, or future locations of ships, (2) the number, type,
or names of ships in any geographic area of the North Atlantic
at a past, present, or future time, and (3) how far is a ship
from some particular place and if ordered to change course when
can it get there?" (p. 206). The system could provide information
on sets of ships satisfying some class description; for example,
it could provide the distances of all ships of a given type from
a given destination, and the time required to reach that
destination, assuming the necessary change in course. The system
illustrates a nice allocation of function between man and machine.
The computer does the bookkeeping and arithmetic; the man
exercises judgment and makes choices. Hopefully, the choices that
the man makes will be the better because of the bookkeeping and
arithmetic that the machine does.
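The kind of bookkeeping involved in the last query can be indicated with a short sketch. It is not a reconstruction of the system described above: the ship records, field names, and function names are all hypothetical, distances are computed with the standard haversine formula on a spherical Earth, and each ship is assumed to proceed directly to the destination at its current speed.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_NM = 3440.065   # approximate mean Earth radius, nautical miles


def distance_nm(lat1, lon1, lat2, lon2):
    """Great-circle distance in nautical miles (haversine formula)."""
    p1, p2 = radians(lat1), radians(lat2)
    dp, dl = p2 - p1, radians(lon2 - lon1)
    h = sin(dp / 2) ** 2 + cos(p1) * cos(p2) * sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_NM * asin(sqrt(h))


def ships_to_destination(ships, ship_type, dest):
    """(name, distance in nm, hours at current speed), nearest ship first."""
    rows = []
    for s in ships:
        if s["type"] == ship_type:
            dist = distance_nm(s["lat"], s["lon"], dest[0], dest[1])
            rows.append((s["name"], dist, dist / s["knots"]))
    return sorted(rows, key=lambda row: row[1])


# Hypothetical records; positions in degrees, speeds in knots.
fleet = [
    {"name": "Alpha", "type": "tanker", "lat": 0.0, "lon": 2.0, "knots": 10.0},
    {"name": "Bravo", "type": "tanker", "lat": 0.0, "lon": 1.0, "knots": 12.0},
    {"name": "Carol", "type": "freighter", "lat": 1.0, "lon": 0.0, "knots": 15.0},
]

for name, dist, hours in ships_to_destination(fleet, "tanker", (0.0, 0.0)):
    print(f"{name}: {dist:.1f} nm, {hours:.1f} h")
```

The program only tabulates distances and times. Which ship to divert, if any, remains the decision maker's choice.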

Two of the more prominent problem areas for which
computer-based decision aids have been developed or planned are medicine
and military tactics.

13.4.1 Computer-Based Aids for Medical Decision Making

Among the first investigators to attempt to apply modern
decision theory to medical decision making were Ledley and Lusted
(1959). During the subsequent fifteen years, many such
applications of decision theoretic techniques were proposed and tried;
and within the past ten years, several experimental computer-based
systems have been developed for the purpose of facilitating
various aspects of decision making in the medical context.
Applications that have been explored include initial patient
interviewing and symptom identification (Griest, Klein, & VanCura, 1973;
Whitehead & Castleman, 1974), analysis, organization, and
presentation of the results of laboratory tests (Button & Gambino, 1973),
personality analysis (Kleinmuntz, 1968; Lusted, 1965), storage and
retrieval of individual-patient data (Collen, 1970; Greene, 1969),
on-demand provision to practitioners of clinical information
(Siegel & Strom, 1972), automated and computer-aided diagnosis of
medical problems (Cumberbatch & Heaps, 1973; Fisher, Fox, & Newman,
1973; Fleiss, Spitzer, Cohen, & Endicott, 1972; Gledhill, Mathews,
& Mackay, 1972; Horrocks & deDombal, 1973; Jacquez, 1972; Lodwick,
1965; Lusted, 1965; McGirr, 1969; Yeh, Betyar, & Hon, 1972),
management and graphical representations of data to aid research
in pharmacology and medicinal chemistry (Castleman, Russell, Webb,
Hollister, Siegel, Zdonik, & Fram, 1974), modelling of physiological
systems and exploration via simulation of the effects of alternative
courses of treatment (Siegel & Farrell, 1973), and training
(Feurzeig, 1964; Feurzeig, Munter, Swets, & Breen, 1964).

The results of one recent study of computer-assisted diagnosis
are particularly relevant to the question of when expert judgment
should or should not be used in the decision process. Leaper (1972,
1975) compared two methods of computer-assisted diagnosis of
disorders for which abdominal pain was a primary symptom (e.g.,
appendicitis, diverticulitis, perforated ulcer). Computer-aided Bayesian
diagnoses were performed using estimates of probabilities that were
either (a) inferred from frequency data collected from 600 patients
or (b) produced by a group of clinicians. The diagnoses that resulted
from the computer-aided method that used the clinicians'
probability estimates were marginally more accurate than those produced
by unaided clinicians (82% versus 80%). The method that made use
of probabilities inferred from incidence data, however, gave
significantly more accurate results (91%). A secondary result of
this study that is of some interest is the fact that most clinicians
insisted on retaining their own probability estimates, even when
those estimates were greatly different from the survey data and
they had been informed of this fact.
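How probabilities inferred from frequency data enter a Bayesian diagnosis can be illustrated with a small sketch. It is not Leaper's program. The case records and symptom labels below are invented, symptoms are assumed to be conditionally independent given the disorder, and add-one smoothing is an added convenience so that a symptom never observed with a disorder does not reduce its probability to zero.

```python
from collections import Counter, defaultdict


def fit_frequencies(cases):
    """Tally diagnoses and symptom counts from (diagnosis, symptoms) records."""
    totals = Counter(dx for dx, _ in cases)
    with_symptom = defaultdict(Counter)
    for dx, symptoms in cases:
        with_symptom[dx].update(set(symptoms))
    return totals, with_symptom


def posterior(symptoms, totals, with_symptom, vocabulary):
    """Bayes' theorem with symptoms treated as independent given the disorder."""
    n = sum(totals.values())
    scores = {}
    for dx, count in totals.items():
        p = count / n                                   # prior from incidence
        for s in sorted(vocabulary):
            rate = (with_symptom[dx][s] + 1) / (count + 2)   # add-one smoothing
            p *= rate if s in symptoms else 1 - rate
        scores[dx] = p
    z = sum(scores.values())
    return {dx: p / z for dx, p in scores.items()}


# Invented records for illustration only.
cases = (
    [("appendicitis", {"rlq pain", "fever"})] * 6
    + [("appendicitis", {"rlq pain"})] * 2
    + [("perforated ulcer", {"rigidity"})] * 3
    + [("perforated ulcer", {"rigidity", "fever"})] * 1
)
vocabulary = {"rlq pain", "fever", "rigidity"}
totals, with_symptom = fit_frequencies(cases)
print(posterior({"rlq pain", "fever"}, totals, with_symptom, vocabulary))
```

Replacing the tallied rates with numbers supplied by clinicians would give the second method in the comparison. The computation is otherwise the same.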

These results strongly suggest that relative frequency data
should be used as a basis for probability estimates in preference
to expert opinions, if such data are available. The principle
should not be applied, of course, without due regard for such
factors as the size and representativeness of the samples from which
the relative frequency data are obtained. As a general rule, the
most defensible strategy in estimating probabilities would seem
to be: use expert judgments only if a more objective method is not
feasible, as would be the case when estimating the probabilities
of very low-frequency events or events that are not reasonably
thought of as "frequentistic" in nature.


13.4.2 Computer-Based Aids for Tactical Decision Making


Much has been written about the use of computer-based aids
to facilitate decision making in the context of tactical operations
(Alden, Levit, & Henke, 1973; Baker, 1970; Bennett, Degan, & Spiegel,
1964; Bowen, Feehrer, Nickerson, & Triggs, 1975; Bowen, Feehrer,
Nickerson, Spooner, & Triggs, 1971; Bowen, Halpin, Long, Lukas,
Mullarkey, & Triggs, 1973; Freedy, Weisbrod, May, Schwartz, &
[...]man, 1973; Gagliardi, Bussey, Kaplan, & Matten, 1965; Hanes
[...], 1966; Levit, Alden, & Henke, 1973; Levit, Alden, Erickson,
[...], 1974; Sidorsky & Simoneau, 1970). The extent to which [...]
and aids have led to improved decision making is probably impossible
to determine. It is easy to be critical of this work,
because progress has certainly not been spectacular, and it may
be that some of the decision-aiding efforts have been poorly
conceived. But tactical decision making is complicated and not
thoroughly understood. It is not surprising that there would be
some false starts before significant progress is made on this
problem. Even false starts can provide useful insights into a
problem, however; if nothing more, they should help to clarify
what the dimensions of the problem are and to provide some clues
concerning the requirements for a solution.

Bowen, Nickerson, Spooner, and Triggs (1970) have described
several computer-based systems that have been, or are being,
developed by the military services to aid the decision-making process
in tactical situations. Among the systems that were reviewed are:
the Army's Tactical Operations System (TOS)--in particular,
TOS-7th Army--and Tactical Fire Direction System (TACFIRE), the Air
Force and Marine Corps' Tactical Information Processing and
Interpretation System (TIPI), the Air Force's Intelligence Data
Handling System (IDHS), and the Navy's Integrated Operational
Intelligence System (IOIS). These systems are intended to improve
tactical decision making by facilitating data management and
manipulation, message routing, display generation, report preparation,
fire control, planning, resource allocation, and other tasks and
functions that fall within the purview of tactical operations.

There are two motivations for bringing such systems into the
tactical situation. One is to unburden the decision maker of tasks
that are just as well performed by machines, and thereby make it
possible for him to devote more time to those aspects of decision
making that require human judgment and expertise. The other is
to upgrade the quality and adequacy of the information on which
decisions are based. This involves not only the problem of
processing and integrating large amounts of information, but also that
of packaging and presenting information in ways that are
well-suited to the information-processing capabilities of the human
being who must make use of it. How effectively existing or
contemplated systems realize these objectives is difficult to
determine with much precision.

It is not the purpose of this review to describe particular
systems in detail. We will, however, consider briefly two systems
as illustrative of those that have been developed, one intended
for operational use, and one for use as a training instrument.

13.4.2.1 AESOP

An intensive program to develop an on-line information-control
system of value to military decision makers in the planning of
tactical and strategic resource allocations was begun by the Mitre
Corporation in 1964. On completion in 1969, the prototype, called
"An Evolutionary System for On-line Planning" (AESOP) to emphasize
its incremental approach to the generation of computer-based
management and planning assistance, made available to system users
a range of techniques which could aid in such diverse activities
as data acquisition, aggregation, plan assessment, and report
preparation.

The AESOP system consists of two major parts. One of these
is a set of capabilities for storing, modifying, retrieving, and
displaying data, and for performing various sorts of symbolic and
arithmetic manipulations with the aid of a flexible
display-oriented user language, a light pen, typewriter, and push-buttons.
Details of these aspects are covered in a variety of program
publications, the most informative of which are Bennett, Haines, and
Summers (1965) and Summers and Bennett (1967).

The  second  part  of  the  system  consists  of  a set  of  simulated 
strategic  and  tactical  military  applications  which  provide  a 
context  for  exercising  the  capabilities  mentioned  above.  One  of 
the  more  significant  of  these  is  that  of  a Tactical  Air  Control 
Center  (TACC)  in  which  the  resource  allocation  tasks  of  a Fighter 
Section/Current Plans Division are simulated. Since this
particular application also served as a testbed for the formal test and
evaluation of AESOP principles, it provides the most comprehensive
picture of the strengths and weaknesses of the system. The
remainder of our current summary will relate to this application and to
the results of evaluation studies. More detailed treatments of
the simulation and evaluation can be found in Doughty (1967),
Doughty and Feehrer (1969), and Doughty, Feehrer, Bachand, and
Green (1969).

As simulated in the AESOP program, the basic task of a Fighter
Section revolves about the allocation (on request by higher
headquarters) of tactical aircraft to each of three mission categories:
(1) on-call close air support, (2) preplanned close air support,
and (3) preplanned counter-air and interdiction. Under "normal"
circumstances the total number of ready aircraft in near proximity
to prescribed target areas is less than the number of aircraft
requested, so the planner is forced to make tradeoffs relating to
such factors as sortie rate, flying time, time over target, and
probable degree of target destruction. The cumulative consequences
of these tradeoffs are: (1) that some requests for support fail to
be satisfied at all, (2) some requests fail to be satisfied on a
timely basis, and (3) some requests, though satisfied on a timely
basis, are not satisfied at the required level.

In  this  context,  the  tactical  version  of  AESOP  has  two 
interrelated  goals:  (1)  the  elimination  of  much  of  the  labor  and 
inaccuracy  associated  with  manual  computation  and  display  of  ready 
resources,  sortie  rates,  flying  times  and  weapons'  effects,  and 
with  the  preparation  of  formal  orders  (Fragmentary  Orders)  to 
squadrons  implicated  in  a planned  allocation,  and  (2)  facilitation 
of  the  problem-solving  activity  of  decision  makers,  that  is,  of  the 
judicious  selection  of  squadrons,  aircraft  types,  weapons  categories, 
and  so  on. 

For  purposes  of  evaluation,  the  actual  resource  allocations 
produced  by  planners  using  the  AESOP  system  were  compared  with 
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those  produced  by  planners  using  a simulated  version  of  the  stan- 
dard system  in  an  inteqrated  series  of  tactical  exercises  depicting 
the  military  maneuvers  of  loyalists  and  insurgents  during  a ten- 
day  limited  war.  Experimental  sessions  began  with  briefings  re- 
lating to  orders  of  battle,  political  and  military  activity,  and 
Joint  Task  Force  requests  for  support  of  loyalist  objectives. 

AESOP and Manual Planning teams then adjourned to commence allocation
activities in response to the simulated JTF requests. The
experiment ended each day with the (automated or manual) production
of squadron Fragmentary Orders.

Each  planner,  whether  operating  the  manual  or  AESOP  system, 
was  required  to  generate  an  allocation  which  represented,  in  his 
judgment, the best tradeoff among four criteria (listed in decreasing
order of importance):

1.  Satisfaction  of  requested  level  of  damage 

2.  Satisfaction  of  requested  time  over  target 

3.  Minimization  of  use  of  recycled  aircraft 
(i.e.,  of  sortie  rate) 

4.  Minimization  of  (total)  flying  time. 
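
The priority ordering above can be illustrated with a small sketch. It does not describe how AESOP or its planners actually weighed the criteria; the report states only that the tradeoff was left to the planner's judgment. The sketch assumes, purely for illustration, a strict lexicographic ordering in which a lower-priority criterion matters only when all higher-priority criteria tie. The class Allocation, its fields, and the sample figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Allocation:
    """Hypothetical summary of one candidate allocation plan."""
    name: str
    damage_shortfall: float   # unmet fraction of requested damage (0 = fully met)
    time_miss_minutes: float  # total deviation from requested time over target
    recycled_sorties: int     # sorties flown by recycled aircraft
    flying_hours: float       # total flying time


def priority_key(a: Allocation) -> tuple:
    # Criteria in decreasing order of importance; smaller is better for each.
    return (a.damage_shortfall, a.time_miss_minutes,
            a.recycled_sorties, a.flying_hours)


def best_allocation(candidates):
    # Tuples compare element by element, which gives a lexicographic ordering.
    return min(candidates, key=priority_key)


plans = [
    Allocation("A", 0.10, 0.0, 2, 40.0),
    Allocation("B", 0.00, 15.0, 4, 55.0),
    Allocation("C", 0.00, 15.0, 3, 60.0),
]
print(best_allocation(plans).name)  # prints "C"
```

Plans B and C tie on the first two criteria, so the third criterion decides between them, and C is preferred despite its longer flying time. A human planner exercising judgment might well trade a small loss on a higher criterion for a large gain on a lower one, which a strict ordering of this kind cannot express.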

The  results  of  the  evaluation  study  contained  few  surprises. 

In  those  aspects  of  planning  activity  for  which  AESOP  provided 
direct assistance, performance of those using the system was superior.
In those aspects for which assistance was not provided,
planners in the two systems performed at approximately equal levels.
The net performance of AESOP planners was superior to that of
manual planners with respect to plan quality and production efficiency,
a finding that must be assessed in light of the fact
that  the  larger  portion  of  the  task  was  fairly  routine  and  required 
little  creative  ability. 

It  is  important  to  note  that  the  AESOP  system  provided  no 
formal procedural aids to the decision maker such as decision algorithms,
linear programming solutions, etc. What benefits accrued
to  users  of  the  system  during  the  more  creative  phases  of  their 
task  seemed  to  result  from  a combination  of  indirect  factors.  It 
appeared  to  be  the  case,  for  example,  that  planners  could  more 
easily  comprehend  the  extent  to  which  resources  would  be  "strained" 
and,  thereby,  develop  a better  "feel"  for  the  nominal  form  of  their 
plan  prior  to  its  production.  This  appreciation  for  the  difficulty 
of  the  problem  with  which  they  were  faced  on  a particular  day  was 
materially  aided  by  the  concise  nature  of  the  displays  provided 
by  the  system.  Planners  who  used  the  system  were  in  a much  better 
position  to  monitor  their  own  progress  while  solving  the  problem 
than  were  those  whose  appreciation  of  the  demands  of  the  situation 
had  to  be  assembled  from  groups  of  formal  documents. 
It appeared also to be the case that, since the system performed
routine aspects automatically, more time was available for
creative  activities  and  for  reiteration  of  plans.  On  several 
occasions  AESOP  planners  attempted  successfully  to  produce  series 
of  allocations  of  progressively  greater  merit  and  stopped  only 
when  they  were  totally  satisfied  with  their  efforts.* 

13.4.2.2  TACTRAIN 

The  Tactical  Training  (TACTRAIN)  facility  was  developed  by 
the  Electric  Boat  Division  of  General  Dynamics,  partly  as  a 
demonstration  that  a modest  computer  with  a CRT  display  could 
be  employed  in  the  training  of  decision-making  skills  and  partly 
as  an  experimental  tool  for  evaluation  of  alternative  tactical 
display/interrogation  formats.  Details  regarding  computer  and 
display  equipment,  software  and  tactical  problem  parameters  used 
in the system and a specific configuration employed in system
evaluation  are  discussed  by  Sidorsky  and  Simoneau  (1973).  The 
summary  presentation  below  draws  heavily  on  their  discussion. 

The  TACTRAIN  system  provides  an  opportunity  for  the  decision 
maker  to  take  on  the  role  of  a commanding  officer  of  a submarine 
on  an  ASW  search-and-destroy  mission.  His  specific  task  is  to 
maneuver in such a way that he simultaneously maximizes the probability
of destroying a simulated enemy ship and minimizes the
probability  that  the  enemy  ship  will  destroy  him.  He  chooses  a 
maneuver  by  selecting  a speed,  a depth,  a firing  range,  and  a 
quantity  of  torpedoes,  each  from  among  five  alternatives.  The 
choices  are  constrained  to  be  consistent  with  the  operating 
characteristics  of  own  and  enemy  ships,  the  parameters  of  own 
ship's weapons complement, and specific sound channel, topographic
and bathythermal conditions. The maneuver implied by
the alternatives that are chosen is then evaluated with respect
to each of four criteria: (1) the probability that own ship would
be able to detect the enemy ship, (2) the probability that the
enemy ship would be able to detect own ship, (3) the probability
that own ship would be able to destroy the enemy ship, given the
maneuver and weapon characteristics, and (4) the probability that
the enemy ship would be able to destroy own ship.
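
The structure of this choice problem can be sketched in a few lines. With four tactical dimensions and five alternatives on each, there are 5 x 5 x 5 x 5 = 625 possible maneuvers, each of which is scored on the four criteria. The evaluate function below is an invented stand-in that returns made-up percentages; the actual TACTRAIN tactical model is not given in this review, and the index conventions (0 = slowest, shallowest, closest, fewest) are assumptions made only for the example.

```python
from itertools import product

DIMENSIONS = ("speed", "depth", "firing_range", "torpedoes")
ALTERNATIVES = range(5)  # five alternatives per dimension, indexed 0..4


def evaluate(maneuver):
    """Toy stand-in for the tactical model; values are percentages, not data."""
    speed, depth, firing_range, torpedoes = maneuver
    closeness = 4 - firing_range  # assumption: index 0 is the closest range
    return {
        "own_detects_enemy": 40 + 10 * depth - 5 * speed,
        "enemy_detects_own": 10 + 15 * speed + 5 * (4 - depth),
        "own_destroys_enemy": 20 + 10 * torpedoes + 10 * closeness,
        "enemy_destroys_own": 5 + 15 * closeness + 5 * speed,
    }


# Every combination of one alternative per dimension is a candidate maneuver.
maneuvers = list(product(ALTERNATIVES, repeat=len(DIMENSIONS)))
print(len(maneuvers))  # prints 625

for criterion, value in evaluate((1, 3, 2, 4)).items():
    print(f"{criterion}: {value}%")
```

Even in this toy form the conflict among criteria is visible: closing the range raises both the chance of destroying the enemy and the chance of being destroyed, so no single maneuver is best on all four criteria at once.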

While  solving  a particular  tactical  problem,  the  officer 
can  retrieve  information  stored  in  the  system  by  interrogating 
the  display  with  a light  pen.  Appropriate  interrogations  lead 


*A quasi-linear program to aid this strategy at a formal level
was later developed by Feehrer (1988) for the AESOP TACC planning
activity.
to one of two categories of display: (1) graphic displays of the
"tactical effectiveness" associated with the choice of a particular
alternative on each tactical dimension (speed, range, etc.) with
respect to each of the four criteria--available prior to a
"command decision," and (2)
alphanumeric  displays  revealing  the  outcome  of  the  maneuver,  the 
number  of  (quality)  points  to  be  assigned  to  the  outcome,  and 
the  cumulative  number  of  points  acquired  as  of  the  end  of  the 
experimental  trial  in  question--available  following  a command 
decision. 

The  developers  of  TACTRAIN  see  it  making  at  least  two  valuable 
inputs  to  the  learning  process  of  the  decision  maker.  First,  it 
provides  immediate  knowledge  of  the  consequences  of  a decision. 

The  decision  maker  discovers  very  quickly  whether  he  destroyed 
the  enemy  ship  and  whether  his  own  ship  was  destroyed  in  the 
process. Moreover, he is provided with an arithmetic measure,
however  arbitrarily  derived,  of  his  cumulative  performance. 
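
The kind of feedback described here, an outcome, a number of points for that outcome, and a running total, can be sketched as follows. The point values are invented for illustration; the review does not give TACTRAIN's actual scoring rule, noting only that the measure was arbitrarily derived.

```python
from itertools import accumulate

# Hypothetical quality points keyed by (enemy_destroyed, own_destroyed).
POINTS = {
    (True, False): 10,
    (True, True): 2,
    (False, False): 0,
    (False, True): -10,
}


def score_session(outcomes):
    """Return points per trial and the running cumulative total."""
    per_trial = [POINTS[outcome] for outcome in outcomes]
    return per_trial, list(accumulate(per_trial))


outcomes = [(True, False), (False, False), (False, True), (True, True)]
per_trial, cumulative = score_session(outcomes)
for n, (pts, total) in enumerate(zip(per_trial, cumulative), start=1):
    print(f"trial {n}: {pts:+d} points, cumulative {total}")
```

For the four sample outcomes the per-trial points are 10, 0, -10, and 2, and the cumulative totals are 10, 10, 0, and 2.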

Second,  the  decision  maker  is  provided,  via  the  display, 
with  a graphic  portrayal  of  the  interactions  between  tactical  and 
environmental variables and their relationship to tactical effectiveness
as represented by detection/counter detection and
hit/miss  outcomes.  And,  inasmuch  as  the  tactical  problem  unfolds 
over  time,  the  decision  maker  also  gains  an  appreciation  for  the 
changing  complexities  of  these  interactions  and  for  the  need 
for  timeliness  in  his  decision. 

13.4.3  Computer-Based  Decision  Aids  and  Training 

It seems highly probable that many attempts to develop computer-based
decision aids will fail in the sense that the aids
that  are  produced  will  not  measure  up  to  the  expectations  of 
their  developers.  This  is  not  necessarily  failure  in  a larger 
view,  however,  if  these  attempts  lead  to  a better  understanding 
of  the  decision-making  process — as  one  might  reasonably  hope  that 
they  will.  To  the  extent  that  these  efforts  do  lead  to  new 
insights  into  various  aspects  of  the  decision-making  process,  they 
will  have  direct  impact  on  training  curricula. 

To  the  extent  that  specific  systems  prove  to  be  effective 
aids  in  operational  situations,  they  will  constitute  new  tools 
with  which  decision  makers  will  have  to  work.  Thus,  their 
existence will represent a new training need, namely the need
to  train  the  users  of  these  aids. 

Perhaps  the  most  challenging  way  in  which  the  development  of 
increasingly  sophisticated  computer-based  systems  relates  to 
training  is  in  the  potential  that  these  systems  represent  for 
providing training for their users. Critics of the idea of
computer-assisted instruction can correctly point out that the
results  of  endeavors  in  this  area  have  not  measured  up  to  the 
expectations  that  were  fostered  by  many  of  the  early  enthusiasts 
for  this  use  of  computers.  Very  real  progress  in  the  area  is 
being  made,  however,  and  it  may  prove  to  be  the  case  that  the 
early  enthusiasts  erred  only  in  failing  to  appreciate  the 
difficulty  of  some  of  the  problems  that  had  to  be  solved  and  the 
time  that  would  be  required  to  solve  them.  There  is  no  question 
but  that  computer  systems  that  are  intended  to  be  used  by  people 
interactively  on  complex  problem-solving  tasks  can  be  given  the 
capability  to  provide  much  of  the  training  that  is  required, 
both  to  initiate  users  and  to  bring  users  from  neophyte  to  expert 
status.  The  potential  gains  to  be  realized  by  building  such 
training  capabilities  into  operational  systems  suggest  that  this 
possibility  is  worth  far  more  attention  than  it  has  yet  received. 